Biosynthesis of cannabinoids and cannabinoid precursors

ABSTRACT

Aspects of the disclosure relate to biosynthesis of cannabinoids and cannabinoid precursors in recombinant cells and in vitro.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of International Application Serial Number PCT/US2020/019760, filed Feb. 25, 2020, entitled “BIOSYNTHESIS OF CANNABINOIDS AND CANNABINOID PRECURSORS,” which claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Application Ser. No. 62/810,367, filed Feb. 25, 2019, entitled “BIOSYNTHESIS OF CANNABINOIDS AND CANNABINOID PRECURSORS,” and U.S. Provisional Application Ser. No. 62/810,938, filed Feb. 26, 2019, entitled “BIOSYNTHESIS OF CANNABINOIDS AND CANNABINOID PRECURSORS,” the disclosure of each of which is incorporated by reference herein in its entirety.

REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB

The instant application contains a Sequence Listing which has been submitted in text format via EFS-Web and is hereby incorporated by reference in its entirety. Said text copy, created on Oct. 23, 2020, is named G091970030US07-SEQ-FL.txt and is 1,806,581 bytes in size.

FIELD OF INVENTION

The present disclosure relates to the biosynthesis of cannabinoids and cannabinoid precursors in recombinant cells.

BACKGROUND

Cannabinoids are chemical compounds that may act as ligands for endocannabinoid receptors and have multiple medical applications. Traditionally, cannabinoids have been isolated from plants of the genus Cannabis. The use of plants for producing cannabinoids is inefficient, however, with isolated products restricted to the primary endogenous Cannabis compounds, and the cultivation of Cannabis plants is restricted in many jurisdictions. Cannabinoids can also be produced through chemical synthesis (see, e.g., U.S. Pat. No. 7,323,576 to Souza et al.). However, such methods suffer from low yields and high cost. Production of cannabinoids, cannabinoid analogs, and cannabinoid precursors using engineered organisms may provide an advantageous approach to meet the increasing demand for these compounds.

SUMMARY

Aspects of the present disclosure provide methods for production of cannabinoids and cannabinoid precursors from fatty acid substrates using genetically modified host cells.

Aspects of the present disclosure provide host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS comprises a sequence that is at least 90% identical to SEQ ID NO: 7. In some embodiments, relative to the sequence of SEQ ID NO: 7, the PKS comprises an amino acid substitution at a residue corresponding to position 34, 50, 70, 71, 76, 100, 151, 203, 219, 285, 359, and/or 385 in SEQ ID NO: 7. In some embodiments, the PKS comprises: the amino acid Q at a residue corresponding to position 34 in SEQ ID NO: 7; the amino acid N at a residue corresponding to position 50 in SEQ ID NO: 7; the amino acid M at a residue corresponding to position 70 in SEQ ID NO: 7; the amino acid Y at a residue corresponding to position 71 in SEQ ID NO: 7; the amino acid I at a residue corresponding to position 76 in SEQ ID NO: 7; the amino acid P or T at a residue corresponding to position 100 in SEQ ID NO: 7; the amino acid P at a residue corresponding to position 151 in SEQ ID NO: 7; the amino acid K at a residue corresponding to position 203 in SEQ ID NO: 7; the amino acid C at a residue corresponding to position 219 in SEQ ID NO: 7; the amino acid A at a residue corresponding to position 285 in SEQ ID NO: 7; the amino acid M at a residue corresponding to position 359 in SEQ ID NO: 7; and/or the amino acid M at a residue corresponding to position 385 in SEQ ID NO: 7.
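By way of illustration only, the "at least 90% identical" criterion recited above can be sketched in code. The sketch below assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes one common convention for the denominator (columns in which neither sequence has a gap). The function name and the toy sequences are hypothetical; other conventions for calculating percent identity exist, and this sketch is not a statement of how percent identity is determined in this disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy 10-residue sequences (not taken from the Sequence Listing).
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQL"
meets_threshold = percent_identity(reference, variant) >= 90.0
```

Here the toy variant differs from the toy reference at one of ten positions, so it sits exactly at the 90% threshold.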

In some embodiments, the PKS is capable of producing:

a) a compound of Formula (4):

b) a compound of Formula (5):

c) a compound of Formula (6):

In some embodiments,

a) the compound of Formula (4) is the compound of Formula (4a):

b) the compound of Formula (5) is the compound of Formula (5a):

and/or

c) the compound of Formula (6) is the compound of Formula (6a):

In some embodiments, the host cell produces more of a compound of Formula (5) than a host cell that comprises a heterologous polynucleotide encoding a PKS that comprises the sequence of SEQ ID NO: 7. In some embodiments, the PKS comprises one or more of the following amino acid substitutions relative to SEQ ID NO: 7: V71Y and F70M. In some embodiments, the PKS comprises: C at a residue corresponding to position 164 in SEQ ID NO: 7; H at a residue corresponding to position 304 in SEQ ID NO: 7; and/or N at a residue corresponding to position 337 in SEQ ID NO: 7. In some embodiments, the PKS comprises SEQ ID NO: 7. In some embodiments, the PKS comprises SEQ ID NO: 15 or 145. In some embodiments, the heterologous polynucleotide comprises a sequence that is at least 90% identical to SEQ ID NO: 38 or 176.

Aspects of the present disclosure relate to host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS comprises a sequence that is at least 90% identical to SEQ ID NO: 714. In some embodiments, the PKS is capable of producing:

a. a compound of Formula (4):

b. a compound of Formula (5):

c. a compound of Formula (6):

In some embodiments,

a) the compound of Formula (4) is the compound of Formula (4a):

b) the compound of Formula (5) is the compound of Formula (5a):

and/or

c) the compound of Formula (6) is the compound of Formula (6a):

Further aspects of the present disclosure provide host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein relative to the sequence of SEQ ID NO: 5 the PKS comprises one or more amino acid substitutions within the active site of the PKS, and wherein the host cell is capable of producing a compound of Formula (4), (5), or (6).

In some embodiments, relative to SEQ ID NO: 5, the PKS comprises an amino acid substitution at one or more of the following positions in SEQ ID NO: 5: 17, 23, 25, 51, 54, 64, 95, 123, 125, 153, 196, 201, 207, 241, 247, 267, 273, 277, 296, 307, 320, 324, 326, 328, 334, 335, and 375. In some embodiments, relative to SEQ ID NO: 5, the PKS comprises: T17K, I23C, L25R, K51R, D54R, F64Y, V95A, T123C, A125S, Y153G, E196K, L201C, I207L, L241I, T247A, M267K, M267G, I273V, L277M, T296A, V307I, D320A, V324I, S326R, H328Y, S334P, S334A, T335C, R375T, or any combination thereof. In some embodiments, relative to SEQ ID NO: 5, the PKS further comprises an amino acid substitution at one or more of the following positions in SEQ ID NO: 5: 284, 100, 116, 278, 108, 348, 71, 92, 128, 100, 135, 229, 128, and 128. In some embodiments, relative to SEQ ID NO: 5, the PKS comprises: I284Y, K100L, K116R, I278E, K108D, L348S, K71R, V92G, T128V, K100M, Y135V, P229A, T128A, T128I, or any combination thereof. In some embodiments,

a) the compound of Formula (4) is the compound of Formula (4a):

b) the compound of Formula (5) is the compound of Formula (5a):

and/or

c) the compound of Formula (6) is the compound of Formula (6a):

In some embodiments, the host cell produces more of a compound of Formula (5) than a host cell that comprises a heterologous polynucleotide encoding a PKS that comprises the sequence of SEQ ID NO: 5. In some embodiments, the PKS comprises a sequence that is at least 90% identical to any one of SEQ ID NOs: 207-249. In some embodiments, the heterologous polynucleotide comprises a sequence that is at least 90% identical to any one of SEQ ID NOs: 250-292.

Further aspects of the present disclosure relate to host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein relative to the sequence of SEQ ID NO: 5 the PKS comprises the amino acid substitution T335C. In some embodiments, the PKS is at least 90% identical to SEQ ID NO: 207. In some embodiments, the PKS comprises SEQ ID NO: 207.
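For readers less familiar with substitution notation such as T335C (original residue, 1-based position, replacement residue), the following minimal sketch shows one way such a label could be applied to a sequence string. The function name, the regular expression, and the four-residue toy sequence are illustrative assumptions; the toy sequence is not SEQ ID NO: 5, and the sketch does not handle positions defined by alignment to a reference ("a residue corresponding to position ...").

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitution(sequence: str, substitution: str) -> str:
    """Return a copy of `sequence` with a substitution such as 'T335C' applied."""
    match = _SUBSTITUTION.match(substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {substitution!r}")
    original = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    index = position - 1  # convert 1-based position to 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy example: T at position 3 replaced with C.
mutated = apply_substitution("MATK", "T3C")
```

Checking the original residue before substituting guards against applying a label to the wrong reference sequence or to a sequence numbered differently.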

Further aspects of the present disclosure relate to host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS comprises the amino acid C at a residue corresponding to position 335 of SEQ ID NO: 5, and wherein the host cell is capable of producing more of a compound of Formula (5) than a host cell that comprises a heterologous polynucleotide encoding a PKS comprising SEQ ID NO: 5.

In some embodiments, the PKS comprises a sequence that is at least 90% identical to any one of SEQ ID NOs: 7, 13, 145, 8, and 15. In some embodiments, the PKS comprises a sequence that is at least 90% identical to SEQ ID NO: 5. In some embodiments, the compound of Formula (5) is the compound of Formula (5a):

In some embodiments, the heterologous polynucleotide comprises a sequence that is at least 90% identical to SEQ ID NO: 250 or 706.

Further aspects of the present disclosure provide host cells that comprise a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS is capable of reacting a compound of Formula (2) with a compound of Formula (3):

to produce a compound of Formula (6):

In some embodiments, the PKS comprises a sequence that is at least 90% identical to SEQ ID NO: 6. In some embodiments, the PKS comprises the amino acid W at a residue corresponding to position 339 of SEQ ID NO: 6. In some embodiments, the PKS comprises: C at a residue corresponding to position 164 in SEQ ID NO: 6; H at a residue corresponding to position 304 in SEQ ID NO: 6; and/or N at a residue corresponding to position 337 in SEQ ID NO: 6. In some embodiments, the PKS is capable of producing:

a) a compound of Formula (4):

or

b) a compound of Formula (5):

In some embodiments,

the compound of Formula (6) is a compound of Formula (6a):

In some embodiments,

a) the compound of Formula (4) is a compound of Formula (4a):

b) the compound of Formula (5) is a compound of Formula (5a):

In some embodiments,

a. the compound of Formula (2) is a compound of Formula (2a):

b. the compound of Formula (3) is a compound of Formula (3a):

In some embodiments, the host cell produces a ratio of compound (6) to compound (5) that is higher than the ratio produced by a host cell that comprises a heterologous polynucleotide encoding a PKS that comprises the sequence of SEQ ID NO: 6. In some embodiments, the PKS comprises SEQ ID NO: 6. In some embodiments, the heterologous polynucleotide comprises a sequence that is at least 90% identical to SEQ ID NO: 37 or 186.

Further aspects of the disclosure provide host cells that comprise a heterologous polynucleotide encoding an acyl activating enzyme (AAE), wherein the AAE comprises a sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 63-69, 141-142, and 707-708.

Further aspects of the disclosure provide host cells that comprise a heterologous polynucleotide comprising a sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 70-76 and 712-713.

Further aspects of the disclosure provide host cells that comprise a heterologous polynucleotide encoding an acyl activating enzyme (AAE), wherein the AAE comprises: the amino acid sequence SGAAPLG (SEQ ID NO: 114); the amino acid sequence AYLGMSSGTSGG (SEQ ID NO: 115); the amino acid sequence DQPA (SEQ ID NO: 116); the amino acid sequence QVAPAELE (SEQ ID NO: 117); the amino acid sequence VVID (SEQ ID NO: 118); and/or the amino acid sequence SGKILRRLLR (SEQ ID NO: 119). In some embodiments, the host cell produces at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% more hexanoyl-coenzyme A in the presence of hexanoic acid and Coenzyme A relative to a recombinant host cell that does not comprise a heterologous gene encoding an AAE and/or the host cell produces at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% more butanoyl-coenzyme A in the presence of butyric acid and Coenzyme A relative to a recombinant host cell that does not comprise a heterologous gene encoding an AAE.

In some embodiments, the AAE comprises: the amino acid sequence SGAAPLG (SEQ ID NO: 114) at residues corresponding to positions 319-325 in SEQ ID NO: 64; the amino acid sequence AYLGMSSGTSGG (SEQ ID NO: 115) at residues corresponding to positions 194-205 in SEQ ID NO: 64; the amino acid sequence DQPA (SEQ ID NO: 116) at residues corresponding to positions 398-401 in SEQ ID NO: 64; the amino acid sequence QVAPAELE (SEQ ID NO: 117) at residues corresponding to positions 495-502 in SEQ ID NO: 64; the amino acid sequence VVID (SEQ ID NO: 118) at residues corresponding to positions 564-567 in SEQ ID NO: 64; and/or the amino acid sequence SGKILRRLLR (SEQ ID NO: 119) at residues corresponding to positions 574-583 in SEQ ID NO: 64.
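As a non-limiting illustration, the presence and location of the six recited motifs in a candidate AAE sequence could be checked with a simple exact-match scan such as the sketch below. The motif strings and SEQ ID NO labels are taken from the text above; the function name and the toy test sequence are hypothetical, and the reported positions are positions within the scanned sequence itself, which correspond to positions in SEQ ID NO: 64 only after alignment (not performed here).

```python
# Motifs recited above, keyed by their sequence identifiers.
AAE_MOTIFS = {
    "SEQ ID NO: 114": "SGAAPLG",
    "SEQ ID NO: 115": "AYLGMSSGTSGG",
    "SEQ ID NO: 116": "DQPA",
    "SEQ ID NO: 117": "QVAPAELE",
    "SEQ ID NO: 118": "VVID",
    "SEQ ID NO: 119": "SGKILRRLLR",
}


def find_motifs(sequence, motifs=AAE_MOTIFS):
    """Map each motif label to a list of 1-based inclusive (start, end) spans."""
    hits = {}
    for label, motif in motifs.items():
        spans = []
        start = sequence.find(motif)
        while start != -1:
            spans.append((start + 1, start + len(motif)))
            start = sequence.find(motif, start + 1)  # allow overlapping hits
        hits[label] = spans
    return hits


# Toy sequence containing two of the motifs (not a real AAE sequence).
toy_sequence = "MM" + "SGAAPLG" + "AA" + "VVID"
toy_hits = find_motifs(toy_sequence)
```

An exact-match scan is the simplest choice; it will miss motifs carrying conservative substitutions, for which an alignment-based comparison would be needed.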

Further aspects of the disclosure provide host cells that comprise a heterologous polynucleotide encoding an acyl activating enzyme (AAE), wherein the AAE comprises: an amino acid sequence with no more than three amino acid substitutions at residues corresponding to positions 428-440 in SEQ ID NO: 64; or an amino acid sequence with no more than one amino acid substitution at residues corresponding to positions 482-491 in SEQ ID NO: 64, wherein the host cell produces at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% more hexanoyl-coenzyme A in the presence of hexanoic acid and Coenzyme A relative to a recombinant host cell that does not comprise a heterologous gene encoding an AAE; and/or wherein the host cell produces at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% more butanoyl-coenzyme A in the presence of butyric acid and Coenzyme A relative to a recombinant host cell that does not comprise a heterologous gene encoding an AAE.

In some embodiments, the AAE comprises: I or V at a residue corresponding to position 432 in SEQ ID NO: 64; S or D at a residue corresponding to position 434 in SEQ ID NO: 64; K or N at a residue corresponding to position 438 in SEQ ID NO: 64; and/or L or M at a residue corresponding to position 488 in SEQ ID NO: 64. In some embodiments, the AAE comprises: the amino acid sequence RGPQIMSGYHKNP (SEQ ID NO: 120); the amino acid sequence RGPQVMDGYHNNP (SEQ ID NO: 121); the amino acid sequence RGPQIMDGYHKNP (SEQ ID NO: 122); the amino acid sequence VDRTKELIKS (SEQ ID NO: 123); and/or the amino acid sequence VDRTKEMIKS (SEQ ID NO: 124). In some embodiments, the AAE comprises a sequence that is at least 90% identical to a sequence selected from SEQ ID NOs: 63-69, 141-142, and 707-708. In some embodiments, the AAE comprises at least one conservative substitution relative to the sequence of SEQ ID NO: 64. In some embodiments, the host cell further comprises one or more heterologous polynucleotides encoding one or more of: a polyketide synthase (PKS), a polyketide cyclase (PKC); a prenyltransferase (PT); and/or a terminal synthase (TS).
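The relationship between the recited 13-residue and 10-residue sequences and the variable positions 432, 434, 438, and 488 can be illustrated by counting mismatches between equal-length strings. The sketch below uses only the sequences recited above; the function name and the assumption that the first residue of each string corresponds to position 428 or 482, respectively, are illustrative, and which recited sequence matches SEQ ID NO: 64 itself is not assumed.

```python
def differing_positions(reference, candidate, first_position):
    """Return the numbered positions at which two equal-length strings differ.

    `first_position` is the number assigned to the first residue.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must have equal length")
    return [
        first_position + offset
        for offset, (r, c) in enumerate(zip(reference, candidate))
        if r != c
    ]


# Sequences recited above for the 428-440 and 482-491 regions.
SEQ_120 = "RGPQIMSGYHKNP"
SEQ_121 = "RGPQVMDGYHNNP"
SEQ_122 = "RGPQIMDGYHKNP"
SEQ_123 = "VDRTKELIKS"
SEQ_124 = "VDRTKEMIKS"

diff_120_vs_121 = differing_positions(SEQ_120, SEQ_121, first_position=428)
diff_123_vs_124 = differing_positions(SEQ_123, SEQ_124, first_position=482)
```

Under these assumptions, SEQ ID NOs: 120 and 121 differ at three positions in the 428-440 region and SEQ ID NOs: 123 and 124 differ at one position in the 482-491 region, matching the "no more than three" and "no more than one" substitution limits recited for those regions.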

Further aspects of the present disclosure provide methods comprising culturing any of the host cells of the disclosure. In some embodiments, the host cell is cultured in media comprising sodium hexanoate. In some embodiments, the host cell is a plant cell, an algal cell, a yeast cell, a bacterial cell, or an animal cell. In certain embodiments, the host cell is a yeast cell. In some embodiments, the yeast cell is a Saccharomyces cell, a Yarrowia cell, a Pichia cell or a Komagataella cell. In certain embodiments, the Saccharomyces cell is a Saccharomyces cerevisiae cell. In some embodiments, the host cell is a bacterial cell. In certain embodiments, the bacterial cell is an E. coli cell. In some embodiments, the host cell further comprises one or more heterologous polynucleotides encoding one or more of: an acyl activating enzyme (AAE), a polyketide cyclase (PKC), a prenyltransferase (PT), and/or a terminal synthase (TS).

Further aspects of the present disclosure provide non-naturally occurring nucleic acids encoding a polyketide synthase (PKS), wherein the non-naturally occurring nucleic acid comprises at least 90% identity to SEQ ID NOs: 32-62, 93-108, 172-206, 250-292, 421-548, 628-705 or 706. Further aspects of the present disclosure provide vectors comprising non-naturally occurring nucleic acids of the disclosure. Further aspects of the present disclosure provide expression cassettes comprising non-naturally occurring nucleic acids of the disclosure. Further aspects of the present disclosure provide host cells transformed with non-naturally occurring nucleic acids, vectors, or expression cassettes of the present disclosure. Further aspects of the present disclosure provide host cells that comprise non-naturally occurring nucleic acids, vectors, or expression cassettes of the present disclosure.

Each of the limitations of the invention can encompass various embodiments of the invention. It is, therefore, anticipated that each of the limitations of the invention involving any one element or combinations of elements can be included in each aspect of the invention. This disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced or of being carried out in various ways. Also, the phraseology and terminology used in this application is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having,” “containing,” “involving,” and variations thereof, is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

BRIEF DESCRIPTION OF DRAWINGS

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. In the drawings:

FIG. 1 is a schematic depicting the native Cannabis biosynthetic pathway for production of cannabinoid compounds, including five enzymatic steps mediated by: (R1a) acyl activating enzymes (AAE); (R2a) olivetol synthase enzymes (OLS); (R3a) olivetolic acid cyclase enzymes (OAC); (R4a) cannabigerolic acid synthase enzymes (CBGAS); and (R5a) terminal synthase enzymes (TS). Formulae 1a-11a correspond to hexanoic acid (1a), hexanoyl-CoA (2a), malonyl-CoA (3a), 3,5,7-trioxododecanoyl-CoA (4a), olivetol (5a), olivetolic acid (6a), geranyl pyrophosphate (7a), cannabigerolic acid (8a), cannabidiolic acid (9a), tetrahydrocannabinolic acid (10a), and cannabichromenic acid (11a). Hexanoic acid is an exemplary carboxylic acid substrate; other carboxylic acids may also be used (e.g., butyric acid, isovaleric acid, octanoic acid, decanoic acid, etc.; see e.g., FIG. 3 below). The enzymes that catalyze the synthesis of 3,5,7-trioxododecanoyl-CoA and olivetolic acid are shown in R2a and R3a, respectively, and can include multi-functional enzymes that catalyze the synthesis of 3,5,7-trioxododecanoyl-CoA and olivetolic acid. The enzymes cannabidiolic acid synthase (CBDAS), tetrahydrocannabinolic acid synthase (THCAS), and cannabichromenic acid synthase (CBCAS) that catalyze the synthesis of cannabidiolic acid, tetrahydrocannabinolic acid, and cannabichromenic acid, respectively, are shown in step R5a. FIG. 1 is adapted from Carvalho et al. “Designing Microorganisms for Heterologous Biosynthesis of Cannabinoids” (2017) FEMS Yeast Research June 1; 17(4), which is incorporated by reference in its entirety.

FIG. 2 is a schematic depicting a heterologous biosynthetic pathway for production of cannabinoid compounds, including five enzymatic steps mediated by: (R1) acyl activating enzymes (AAE); (R2) polyketide synthase enzymes (PKS) or bifunctional polyketide synthase-polyketide cyclase enzymes (PKS-PKC); (R3) polyketide cyclase enzymes (PKC) or bifunctional PKS-PKC enzymes; (R4) prenyltransferase enzymes (PT); and (R5) terminal synthase enzymes (TS). Any carboxylic acid of varying chain lengths, structures (e.g., aliphatic, alicyclic, or aromatic) and functionalization (e.g., hydroxylic-, keto-, amino-, thiol-, aryl-, or halogeno-) may also be used as precursor substrates (e.g., thiopropionic acid, hydroxy phenyl acetic acid, norleucine, bromodecanoic acid, butyric acid, isovaleric acid, octanoic acid, decanoic acid, etc.).

FIG. 3 is a non-exclusive representation of select putative precursors for the cannabinoid pathway in FIG. 2.

FIG. 4 is a graph showing activity of E. coli strains expressing candidate AAEs as measured by a 5,5′-dithiobis-(2-nitrobenzoic acid) (“DTNB”) assay. Lysates of E. coli expressing candidate AAEs were assayed for ligation activity of free CoA to either butyrate or hexanoate. Activity was quantified by measuring the decrease in absorbance at 412 nm, corresponding to a decrease of free CoA in solution. Error bars represent the standard deviation of 2 independent measurements. Negative control strain t49568 expresses an aldehyde dehydrogenase protein from Y. lipolytica (corresponding to UniProt ID Q6C5T1).
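For orientation only, a decrease in absorbance at 412 nm in a DTNB assay can be related to a decrease in free thiol (here, free CoA) through the Beer-Lambert law, concentration change = ΔA / (ε · l). The sketch below assumes a commonly cited molar extinction coefficient for the DTNB reaction product of about 14,150 M⁻¹ cm⁻¹ and a 1 cm path length; both values, the function name, and the example absorbances are assumptions for illustration and are not taken from this disclosure, which reports activity as the decrease in absorbance.

```python
# Assumed molar extinction coefficient of the DTNB product at 412 nm
# (M^-1 cm^-1); a commonly cited literature value, not from this disclosure.
TNB_EXTINCTION_M_CM = 14150.0


def coa_consumed_micromolar(
    a412_start,
    a412_end,
    path_length_cm=1.0,  # assumed cuvette path length
    extinction=TNB_EXTINCTION_M_CM,
):
    """Estimate free CoA consumed (micromolar) from a drop in A412."""
    delta_absorbance = a412_start - a412_end
    molar = delta_absorbance / (extinction * path_length_cm)
    return molar * 1e6  # convert mol/L to micromol/L


# Hypothetical readings chosen to give a round number.
example_consumed = coa_consumed_micromolar(1.415, 0.0)
```

Plate readers typically have path lengths shorter than 1 cm, so the path length argument would need to be adjusted for the instrument actually used.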

FIGS. 5A-5B show a plasmid used to express AAE and OLS proteins in S. cerevisiae. The coding sequence for the enzyme being expressed (labeled “Library gene”) is driven by the GAL1 promoter. The plasmid contains markers for both yeast (URA3) and bacteria (ampR), as well as origins of replication for yeast (2 micron), bacteria (pBR322), and phage (f1).

FIG. 6 is a graph showing activity of S. cerevisiae strains expressing candidate AAEs as measured by a DTNB assay. Lysates of S. cerevisiae expressing candidate AAEs were assayed for ligation activity of free CoA to hexanoate. Activity was quantified by measuring the decrease in absorbance at 412 nm, corresponding to a decrease of free CoA in solution. Error bars represent the standard deviation of 3 independent measurements. Negative control strain t390338 expresses GFP.

FIGS. 7A-7C show a sequence alignment of acyl activating enzymes (AAEs). An alignment of t51477 (SEQ ID NO: 65), t49578 (SEQ ID NO: 63), t49594 (SEQ ID NO: 64), t392878 (SEQ ID NO: 68), t392879 (SEQ ID NO: 69), t55127 (SEQ ID NO: 66), and t55128 (SEQ ID NO: 67) is shown. The sequence alignment was conducted using Clustal Omega. See, e.g., Chojnacki et al., Nucleic Acids Res. 2017 Jul. 3; 45(W1):W550-W553.

FIG. 8 is a graph showing olivetol production by S. cerevisiae strains expressing OLS candidate enzymes. Peak areas obtained via LC/MS quantification were normalized to an internal standard for olivetol. Normalized peak areas were further normalized to a positive control strain (t339582) contained on each plate. As explained in Example 2 and in Table 6, OLS candidate enzymes within the library depicted in this Figure, and the positive control OLSs depicted in this Figure, were later found to contain a deletion in the nucleotide sequence encoding the OLS proteins, which led to the production of truncated proteins. Accordingly, all candidate OLS enzymes in this library, and the positive controls, were also tested independently in a new library containing only full-length OLS sequences, described in Example 3.

FIG. 9 is a graph showing olivetolic acid (OA) production by S. cerevisiae strains expressing OLS candidate enzymes. Peak areas obtained via LC/MS quantification were normalized to an internal standard for OA. Normalized peak areas were further normalized to a positive control strain (t339582) contained on each plate. As explained in Example 2 and in Table 6, OLS candidate enzymes within the library depicted in this Figure, and the positive control OLSs depicted in this Figure, were later found to contain a deletion in the nucleotide sequence encoding the OLS proteins, which led to the production of truncated proteins. Accordingly, all candidate OLS enzymes in this library, and the positive controls, were also tested independently in a new library containing only full-length OLS sequences, described in Example 3.

FIG. 10 is a graph showing normalized OA versus olivetol production by S. cerevisiae strains expressing OLS candidate enzymes. Peak areas obtained via LC/MS quantification for olivetol and OA were normalized to an internal standard for olivetol or OA, respectively. The regression line shown in FIG. 10 represents a 1:1 ratio of olivetol and olivetolic acid. Normalized peak areas were further normalized to positive control strain (t339582) contained on each plate. The strain t395094 demonstrated significantly increased olivetol production compared to the positive controls, while the strain t393974 showed enhanced production of OA over the positive control strains. The enhanced production of OA over olivetol by t393974 suggests that this enzyme possesses bifunctional PKS-PKC activity. As explained in Example 2 and in Table 6, OLS candidate enzymes within the library depicted in this Figure, and the positive control OLSs depicted in this Figure, were later found to contain a deletion in the nucleotide sequence encoding the OLS proteins, which led to the production of truncated proteins. Accordingly, all candidate OLS enzymes in this library, and the positive controls, were also tested independently in a new library containing only full-length OLS sequences, described in Example 3.
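The two-step normalization described for FIGS. 8-10 (analyte peak area divided by internal standard peak area, then divided by the same ratio for the positive control on that plate) can be sketched as follows. The function name, the data layout, the strain labels, and the peak areas are hypothetical and are given only to make the arithmetic concrete; they are not measurements from this disclosure.

```python
def normalize_plate(peak_areas, control_strain):
    """Normalize one plate of LC/MS peak areas.

    `peak_areas` maps strain label -> (analyte_area, internal_standard_area).
    Each strain's analyte/internal-standard ratio is divided by the ratio
    for `control_strain`, so the control is 1.0 by construction.
    """
    control_analyte, control_standard = peak_areas[control_strain]
    control_ratio = control_analyte / control_standard
    if control_ratio == 0:
        raise ValueError("positive control produced no analyte signal")
    return {
        strain: (analyte / standard) / control_ratio
        for strain, (analyte, standard) in peak_areas.items()
    }


# Hypothetical plate with made-up peak areas.
plate = {
    "positive_control": (200.0, 100.0),
    "candidate_1": (300.0, 100.0),
    "candidate_2": (100.0, 100.0),
}
normalized = normalize_plate(plate, "positive_control")
```

Normalizing to a control on the same plate is what allows values from different plates to be compared on a common scale.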

FIGS. 11A-11H show a sequence alignment of olivetol synthases (OLSs). An alignment of t394911 (SEQ ID NO: 28), t393974 (SEQ ID NO: 6), t393720 (SEQ ID NO: 27), t394336 (SEQ ID NO: 8), t393991 (SEQ ID NO: 7), t395011 (SEQ ID NO: 15), t339568 (SEQ ID NO: 5), t339579 (SEQ ID NO: 30), t339582 (SEQ ID NO: 31), t394457 (SEQ ID NO: 10), t394521 (SEQ ID NO: 11), t394436 (SEQ ID NO: 26), t395094 (SEQ ID NO: 17), t394087 (SEQ ID NO: 1), t395023 (SEQ ID NO: 29), t395103 (SEQ ID NO: 18), t394687 (SEQ ID NO: 2), t393835 (SEQ ID NO: 19), t394037 (SEQ ID NO: 22), t394905 (SEQ ID NO: 13), t393563 (SEQ ID NO: 4), t394981 (SEQ ID NO: 14), t394790 (SEQ ID NO: 12), t394797 (SEQ ID NO: 16), t394091 (SEQ ID NO: 21), t394043 (SEQ ID NO: 24), t394404 (SEQ ID NO: 25), t393495 (SEQ ID NO: 3), t394547 (SEQ ID NO: 9), t394115 (SEQ ID NO: 20), and t394279 (SEQ ID NO: 23) is shown. The sequence alignment was conducted using Clustal Omega. See, e.g., Chojnacki et al., Nucleic Acids Res. 2017 Jul. 3; 45(W1):W550-W553.

FIGS. 12A-12B are graphs showing olivetol production (FIG. 12A) and olivetolic acid production (FIG. 12B) from S. cerevisiae strains expressing OLS candidate enzymes. Strains shown were determined to be hits from the primary screen of the library of OLS candidates screened in Example 3.

FIG. 13 is a graph showing olivetol production by S. cerevisiae strains expressing C. sativa OLS (CsOLS) point-mutant variants. Concentrations of olivetol in g/L were determined via LC/MS quantification. The strain t405417 (having a T335C point mutation relative to the CsOLS set forth in SEQ ID NO: 5) demonstrated the highest olivetol production. Error bars represent the standard deviation of 4 independent measurements.

FIGS. 14A-14B are graphs showing olivetol production from S. cerevisiae strains expressing single point mutation and multiple point mutation variants based on a Cymbidium hybrid cultivar OLS template (ChOLS) (FIG. 14A) and a Corchorus olitorius OLS (CoOLS) template (FIG. 14B). Strains shown were screened in a secondary screen as described in Example 5. Olivetol titers were normalized to the mean olivetol titer produced by the positive control strain t527346 (FIG. 14A), and t606797 (FIG. 14B).

FIG. 15 is a graph showing olivetol production by a prototrophic S. cerevisiae strain expressing candidate OLS enzymes. Concentrations of olivetol in g/L were determined via LC/MS quantification. Performance of OLS candidate enzymes exhibiting higher olivetol production than C. sativa OLS positive controls is shown. The strains t485662, t485672, and t496073 demonstrated comparable olivetol production to the CsOLS T335C point-mutant positive control.

FIG. 16 is a three-dimensional homology model showing residues within about 8 angstroms of any of the residues within the catalytic triad of the C. sativa OLS comprising SEQ ID NO: 5 and/or within about 8 angstroms of a docked substrate within the C. sativa OLS comprising SEQ ID NO: 5. Only residues at which an amino acid substitution resulted in production of at least 10 mg/L olivetol are shown with their electron clouds in light gray. The active site was defined to include a docked molecule of hexanoyl-CoA (OLS substrate) plus the catalytic triad. The top model was rotated 90° to produce the bottom model.

FIG. 17 is a three-dimensional homology model showing residues within about 12 angstroms of any of the residues within the catalytic triad of the C. sativa OLS comprising SEQ ID NO: 5 and/or within about 12 angstroms of a docked substrate within the C. sativa OLS comprising SEQ ID NO: 5. Only residues at which an amino acid substitution resulted in production of at least 10 mg/L olivetol are shown with their electron clouds. The active site was defined to include a docked molecule of hexanoyl-CoA (OLS substrate) plus the catalytic triad. The top model was rotated 90° to produce the bottom model.
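The distance-based selection described for FIGS. 16 and 17 can be illustrated with a short sketch that selects residues having at least one atom within a cutoff distance of any atom of the defined active site (catalytic triad plus docked substrate). The any-atom-to-any-atom criterion, the function name, the data layout, and the toy coordinates are assumptions made for illustration; the disclosure states only that residues are within "about" 8 or 12 angstroms.

```python
import math


def residues_within(residue_atoms, site_atoms, cutoff_angstroms):
    """Return sorted residue numbers with any atom within the cutoff of the site.

    `residue_atoms` maps residue number -> list of (x, y, z) coordinates.
    `site_atoms` is a list of (x, y, z) coordinates for the active site
    (for example, catalytic triad atoms plus docked substrate atoms).
    """
    selected = []
    for residue_number, atoms in residue_atoms.items():
        if any(
            math.dist(atom, site_atom) <= cutoff_angstroms
            for atom in atoms
            for site_atom in site_atoms
        ):
            selected.append(residue_number)
    return sorted(selected)


# Toy coordinates in angstroms (not from any structural model).
toy_residues = {
    10: [(0.0, 0.0, 0.0)],
    20: [(10.0, 0.0, 0.0)],
    30: [(20.0, 0.0, 0.0)],
}
toy_site = [(0.0, 0.0, 0.0)]
shell_8 = residues_within(toy_residues, toy_site, 8.0)
shell_12 = residues_within(toy_residues, toy_site, 12.0)
```

With these toy coordinates the 12 angstrom shell contains every residue in the 8 angstrom shell plus one more, mirroring the way FIG. 17 extends the region shown in FIG. 16.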

DETAILED DESCRIPTION

This disclosure provides methods for production of cannabinoids and cannabinoid precursors from fatty acid substrates using genetically modified host cells. Methods include heterologous expression of enzymes including acyl activating enzymes (AAE) and polyketide synthase enzymes (PKS) such as olivetol synthase enzymes (OLS). The disclosure describes identification of AAE and OLS enzymes that can be functionally expressed in eukaryotic (e.g., S. cerevisiae) and prokaryotic (e.g., E. coli) host cells. As demonstrated in Example 1, novel AAE enzymes were identified that are capable of using hexanoate and butyrate as substrates to produce cannabinoid precursors. As demonstrated in Examples 2-3, novel OLS enzymes were identified that are capable of producing olivetol and olivetolic acid. Examples 4-6 further demonstrate enhanced production of olivetol and/or olivetolic acid by protein engineering of OLS enzymes. The novel enzymes described in this disclosure may be useful in increasing the efficiency and purity of cannabinoid production.

Definitions

While the following terms are believed to be well understood by one of ordinary skill in the art, the following definitions are set forth to facilitate explanation of the disclosed subject matter.

The term “a” or “an” refers to one or more of an entity, i.e., can identify a referent as plural. Thus, the terms “a” or “an,” “one or more” and “at least one” are used interchangeably in this application. In addition, reference to “an element” by the indefinite article “a” or “an” does not exclude the possibility that more than one of the elements is present, unless the context clearly requires that there is one and only one of the elements.

The terms “microorganism” or “microbe” should be taken broadly. These terms are used interchangeably and include, but are not limited to, the two prokaryotic domains, Bacteria and Archaea, as well as certain eukaryotic fungi and protists. In some embodiments, the disclosure may refer to the “microorganisms” or “microbes” of lists/tables and figures present in the disclosure. This characterization can refer to not only the identified taxonomic genera of the tables and figures, but also the identified taxonomic species, as well as the various novel and newly identified or designed strains of any organism in the tables or figures. The same characterization holds true for the recitation of these terms in other parts of the specification, such as in the Examples.

The term “prokaryotes” is recognized in the art and refers to cells that contain no nucleus or other cell organelles. The prokaryotes are generally classified in one of two domains, the Bacteria and the Archaea.

“Bacteria” or “eubacteria” refers to a domain of prokaryotic organisms. Bacteria include at least 11 distinct groups as follows: (1) Gram-positive (gram+) bacteria, of which there are two major subdivisions: (a) high G+C group (Actinomycetes, Mycobacteria, Micrococcus, others) and (b) low G+C group (Bacillus, Clostridia, Lactobacillus, Staphylococci, Streptococci, Mycoplasmas); (2) Proteobacteria, e.g., Purple photosynthetic + non-photosynthetic Gram-negative bacteria (includes most “common” Gram-negative bacteria); (3) Cyanobacteria, e.g., oxygenic phototrophs; (4) Spirochetes and related species; (5) Planctomyces; (6) Bacteroides, Flavobacteria; (7) Chlamydia; (8) Green sulfur bacteria; (9) Green non-sulfur bacteria (also anaerobic phototrophs); (10) Radioresistant micrococci and relatives; and (11) Thermotoga and Thermosipho thermophiles.

The term “Archaea” refers to a taxonomic classification of prokaryotic organisms with certain properties that make them distinct from Bacteria in physiology and phylogeny.

The term “Cannabis” refers to a genus in the family Cannabaceae. Cannabis is a dioecious plant. Glandular structures located on female flowers of Cannabis, called trichomes, accumulate relatively high amounts of a class of terpeno-phenolic compounds known as phytocannabinoids (described in further detail below). Cannabis has conventionally been cultivated for production of fibre and seed (commonly referred to as “hemp-type”), or for production of intoxicants (commonly referred to as “drug-type”). In drug-type Cannabis, the trichomes contain relatively high amounts of tetrahydrocannabinolic acid (THCA), which can convert to tetrahydrocannabinol (THC) via a decarboxylation reaction, for example upon combustion of dried Cannabis flowers, to provide an intoxicating effect. Drug-type Cannabis often contains other cannabinoids in lesser amounts. In contrast, hemp-type Cannabis contains relatively low concentrations of THCA, often less than 0.3% THC by dry weight, accounting for the ability of THCA to convert to THC. Hemp-type Cannabis may contain non-THC and non-THCA cannabinoids, such as cannabidiolic acid (CBDA), cannabidiol (CBD), and other cannabinoids. Presently, there is a lack of consensus regarding the taxonomic organization of the species within the genus. Unless context dictates otherwise, the term “Cannabis” is intended to include all putative species within the genus, such as, without limitation, Cannabis sativa, Cannabis indica, and Cannabis ruderalis and without regard to whether the Cannabis is hemp-type or drug-type.
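
The phrase “accounting for the ability of THCA to convert to THC” can be illustrated with a short calculation. The sketch below is not part of the disclosure: it assumes the commonly used mass-conversion factor of about 0.877, taken here as the ratio of the molar masses of THC (about 314.46 g/mol) and THCA (about 358.47 g/mol), and it assumes complete decarboxylation. The function name and the sample values are hypothetical.

```python
# Molar masses in g/mol (standard reference values, not stated in the disclosure).
THC_MOLAR_MASS = 314.46
THCA_MOLAR_MASS = 358.47

# Mass fraction of THCA that remains as THC after loss of CO2 (about 0.877).
DECARB_FACTOR = THC_MOLAR_MASS / THCA_MOLAR_MASS


def total_thc_percent(thc_percent: float, thca_percent: float) -> float:
    """Return total potential THC (% of dry weight), assuming complete decarboxylation."""
    return thc_percent + DECARB_FACTOR * thca_percent


# Hypothetical hemp-type sample: 0.05% THC and 0.25% THCA by dry weight.
sample_total = total_thc_percent(0.05, 0.25)
is_below_threshold = sample_total < 0.3  # compare against the 0.3% figure in the text
```

In this hypothetical sample the THCA alone is below 0.3%, and the combined figure remains below 0.3% only because the conversion factor is less than one.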

The term “cyclase activity” in reference to a polyketide synthase (PKS) enzyme (e.g., an olivetol synthase (OLS) enzyme) or a polyketide cyclase (PKC) enzyme (e.g., an olivetolic acid cyclase (OAC) enzyme), refers to the activity of catalyzing the cyclization of an oxo fatty acyl-CoA (e.g., 3,5,7-trioxododecanoyl-CoA, 3,5,7-trioxodecanoyl-CoA) to the corresponding intramolecular cyclization product (e.g., olivetolic acid, divarinic acid). In some embodiments, the PKS catalyzes the C2-C7 aldol condensation of an acyl-CoA with three additional ketide moieties added thereto.

A “cytosolic” or “soluble” enzyme refers to an enzyme that is predominantly localized (or predicted to be localized) in the cytosol of a host cell.

A “eukaryote” is any organism whose cells contain a nucleus and other organelles enclosed within membranes. Eukaryotes belong to the taxon Eukarya or Eukaryota. The defining feature that sets eukaryotic cells apart from prokaryotic cells (i.e., bacteria and archaea) is that they have membrane-bound organelles, especially the nucleus, which contains the genetic material, and is enclosed by the nuclear envelope.

The term “host cell” refers to a cell that can be used to express a polynucleotide, such as a polynucleotide that encodes an enzyme used in biosynthesis of cannabinoids or cannabinoid precursors. The terms “genetically modified host cell,” “recombinant host cell,” and “recombinant strain” are used interchangeably and refer to host cells that have been genetically modified by, e.g., cloning and transformation methods, or by other methods known in the art (e.g., selective editing methods, such as CRISPR). Thus, the terms include a host cell (e.g., bacterial cell, yeast cell, fungal cell, insect cell, plant cell, mammalian cell, human cell, etc.) that has been genetically altered, modified, or engineered, so that it exhibits an altered, modified, or different genotype and/or phenotype, as compared to the naturally-occurring cell from which it was derived. It is understood that in some embodiments, the terms refer not only to the particular recombinant host cell in question, but also to the progeny or potential progeny of such a host cell.

The term “control host cell,” or the term “control” when used in relation to a host cell, refers to an appropriate comparator host cell for determining the effect of a genetic modification or experimental treatment. In some embodiments, the control host cell is a wild type cell. In other embodiments, a control host cell is genetically identical to the genetically modified host cell, except for the genetic modification(s) differentiating the genetically modified or experimental treatment host cell. In some embodiments, the control host cell has been genetically modified to express a wild type or otherwise known variant of an enzyme being tested for activity in other test host cells.

The term “heterologous” with respect to a polynucleotide, such as a polynucleotide comprising a gene, is used interchangeably with the term “exogenous” and the term “recombinant” and refers to a polynucleotide that has been artificially supplied to a biological system, a polynucleotide that has been modified within a biological system, or a polynucleotide whose expression or regulation has been manipulated within a biological system. A heterologous polynucleotide that is introduced into or expressed in a host cell may be a polynucleotide that comes from a different organism or species than the host cell, or may be a synthetic polynucleotide, or may be a polynucleotide that is also endogenously expressed in the same organism or species as the host cell. For example, a polynucleotide that is endogenously expressed in a host cell may be considered heterologous when it is situated non-naturally in the host cell; expressed recombinantly in the host cell, either stably or transiently; modified within the host cell; selectively edited within the host cell; expressed in a copy number that differs from the naturally occurring copy number within the host cell; or expressed in a non-natural way within the host cell, such as by manipulating regulatory regions that control expression of the polynucleotide. In some embodiments, a heterologous polynucleotide is a polynucleotide that is endogenously expressed in a host cell but whose expression is driven by a promoter that does not naturally regulate expression of the polynucleotide. In other embodiments, a heterologous polynucleotide is a polynucleotide that is endogenously expressed in a host cell and whose expression is driven by a promoter that does naturally regulate expression of the polynucleotide, but the promoter or another regulatory region is modified. In some embodiments, the promoter is recombinantly activated or repressed. For example, gene-editing based techniques may be used to regulate expression of a polynucleotide, including an endogenous polynucleotide, from a promoter, including an endogenous promoter. See, e.g., Chavez et al., Nat Methods. 2016 July; 13(7): 563-567. A heterologous polynucleotide may comprise a wild-type sequence or a mutant sequence as compared with a reference polynucleotide sequence.

The term “at least a portion” or “at least a fragment” of a nucleic acid or polypeptide means a portion having the minimal size characteristics of such sequences, or any larger fragment of the full length molecule, up to and including the full length molecule. A fragment of a polynucleotide of the disclosure may encode a biologically active portion of an enzyme, such as a catalytic domain. A biologically active portion of a genetic regulatory element may comprise a portion or fragment of a full length genetic regulatory element and have the same type of activity as the full length genetic regulatory element, although the level of activity of the biologically active portion of the genetic regulatory element may vary compared to the level of activity of the full length genetic regulatory element.

A coding sequence and a regulatory sequence are said to be “operably joined” or “operably linked” when the coding sequence and the regulatory sequence are covalently linked and the expression or transcription of the coding sequence is under the influence or control of the regulatory sequence. If the coding sequence is to be translated into a functional protein, the coding sequence and the regulatory sequence are said to be operably joined if induction of a promoter in the 5′ regulatory sequence results in transcription of the coding sequence and if the nature of the linkage between the coding sequence and the regulatory sequence does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the promoter region to direct the transcription of the coding sequence, or (3) interfere with the ability of the corresponding RNA transcript to be translated into a protein.

The term “volumetric productivity” or “production rate” refers to the amount of product formed per volume of medium per unit of time. Volumetric productivity can be reported in grams per liter per hour (g/L/h).

The term “specific productivity” of a product refers to the rate of formation of the product normalized by unit volume or mass or biomass and has the physical dimension of a quantity of substance per unit time per unit mass or volume [M·T⁻¹·M⁻¹ or M·T⁻¹·L⁻³, where M is mass or moles, T is time, L is length].

The term “biomass specific productivity” refers to the specific productivity in gram product per gram of cell dry weight (CDW) per hour (g/g CDW/h) or in mmol of product per gram of cell dry weight (CDW) per hour (mmol/g CDW/h). Using the relation of CDW to OD600 for the given microorganism, specific productivity can also be expressed as gram product per liter culture medium per optical density of the culture broth at 600 nm (OD) per hour (g/L/h/OD). Also, if the elemental composition of the biomass is known, biomass specific productivity can be expressed in mmol of product per C-mole (carbon mole) of biomass per hour (mmol/C-mol/h).
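
The productivity terms defined above differ only in what the rate of product formation is normalized by. The following sketch illustrates the units; it is not a method from the disclosure. It computes interval averages and treats biomass and optical density as constant over the interval, which is a simplifying assumption. All function names and numerical values are hypothetical.

```python
def volumetric_productivity(product_g: float, volume_l: float, hours: float) -> float:
    """Average product formed per liter of medium per hour (g/L/h)."""
    return product_g / (volume_l * hours)


def biomass_specific_productivity(product_g: float, cdw_g: float, hours: float) -> float:
    """Average product formed per gram of cell dry weight per hour (g/g CDW/h)."""
    return product_g / (cdw_g * hours)


def od_specific_productivity(product_g: float, volume_l: float,
                             od600: float, hours: float) -> float:
    """Average product formed per liter per OD600 unit per hour (g/L/h/OD)."""
    return product_g / (volume_l * od600 * hours)


# Hypothetical run: 12 g of product in 2 L over 24 h, with 8 g CDW and OD600 of 10.
q_vol = volumetric_productivity(12.0, 2.0, 24.0)        # g/L/h
q_cdw = biomass_specific_productivity(12.0, 8.0, 24.0)  # g/g CDW/h
q_od = od_specific_productivity(12.0, 2.0, 10.0, 24.0)  # g/L/h/OD
```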

The term “yield” refers to the amount of product obtained per unit weight of a certain substrate and may be expressed as g product per g substrate (g/g) or moles of product per mole of substrate (mol/mol). Yield may also be expressed as a percentage of the theoretical yield. “Theoretical yield” is defined as the maximum amount of product that can be generated per a given amount of substrate as dictated by the stoichiometry of the metabolic pathway used to make the product and may be expressed as g product per g substrate (g/g) or moles of product per mole of substrate (mol/mol).
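
As a worked illustration of the relationship between yield and theoretical yield, the sketch below computes a mass yield and expresses it as a percentage of a theoretical yield. The theoretical yield used here is a placeholder chosen for the arithmetic, not a value derived from any pathway in the disclosure, and the function names are hypothetical.

```python
def mass_yield(product_g: float, substrate_g: float) -> float:
    """Yield in grams of product per gram of substrate consumed (g/g)."""
    return product_g / substrate_g


def percent_of_theoretical(actual_yield: float, theoretical_yield: float) -> float:
    """Express a yield as a percentage of the stoichiometric maximum."""
    return 100.0 * actual_yield / theoretical_yield


# Hypothetical values: 5 g of product from 100 g of substrate,
# with an assumed (placeholder) theoretical yield of 0.25 g/g.
observed = mass_yield(5.0, 100.0)                  # g/g
relative = percent_of_theoretical(observed, 0.25)  # percent of theoretical
```

Both yields must be in the same basis (g/g or mol/mol) for the percentage to be meaningful.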

The term “titer” refers to the strength of a solution or the concentration of a substance in solution. For example, the titer of a product of interest (e.g., small molecule, peptide, synthetic compound, fuel, alcohol, etc.) in a fermentation broth is described as g of product of interest in solution per liter of fermentation broth or cell-free broth (g/L) or as g of product of interest in solution per kg of fermentation broth or cell-free broth (g/kg).

The term “total titer” refers to the sum of all products of interest produced in a process, including but not limited to the products of interest in solution, the products of interest in gas phase if applicable, and any products of interest removed from the process and recovered relative to the initial volume in the process or the operating volume in the process. For example, the total titer of products of interest (e.g., small molecule, peptide, synthetic compound, fuel, alcohol, etc.) in a fermentation broth is described as g of products of interest in solution per liter of fermentation broth or cell-free broth (g/L) or as g of products of interest in solution per kg of fermentation broth or cell-free broth (g/kg).
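
The difference between titer and total titer can be shown with a small calculation. This is a minimal sketch under stated assumptions: the reference volume is taken as the operating volume, all quantities are in grams and liters, and the function names and numbers are hypothetical rather than drawn from the disclosure.

```python
def titer(product_in_solution_g: float, broth_volume_l: float) -> float:
    """Concentration of product in solution (g/L)."""
    return product_in_solution_g / broth_volume_l


def total_titer(in_solution_g: float, gas_phase_g: float,
                removed_and_recovered_g: float, reference_volume_l: float) -> float:
    """Sum of all product fractions relative to a reference volume (g/L)."""
    total_product_g = in_solution_g + gas_phase_g + removed_and_recovered_g
    return total_product_g / reference_volume_l


# Hypothetical process: 10 g in solution, 1 g in the gas phase, and
# 4 g removed and recovered, with a 2 L operating volume.
solution_titer = titer(10.0, 2.0)
process_total_titer = total_titer(10.0, 1.0, 4.0, 2.0)
```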

The term “amino acid” refers to organic compounds that comprise an amino group, —NH₂, and a carboxyl group, —COOH. The term “amino acid” includes both naturally occurring and unnatural amino acids. Nomenclature for the twenty common amino acids is as follows: alanine (ala or A); arginine (arg or R); asparagine (asn or N); aspartic acid (asp or D); cysteine (cys or C); glutamine (gln or Q); glutamic acid (glu or E); glycine (gly or G); histidine (his or H); isoleucine (ile or I); leucine (leu or L); lysine (lys or K); methionine (met or M); phenylalanine (phe or F); proline (pro or P); serine (ser or S); threonine (thr or T); tryptophan (trp or W); tyrosine (tyr or Y); and valine (val or V). Non-limiting examples of unnatural amino acids include homo-amino acids, proline and pyruvic acid derivatives, 3-substituted alanine derivatives, glycine derivatives, ring-substituted phenylalanine derivatives, ring-substituted tyrosine derivatives, linear core amino acids, amino acids with protecting groups including Fmoc, Boc, and Cbz, β-amino acids (β3 and β2), and N-methyl amino acids.
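
The three-letter and one-letter codes listed above can be captured as a lookup table. The sketch below is illustrative only; the dictionary contents follow the nomenclature in the preceding paragraph, while the function name and the example peptide are hypothetical.

```python
# Three-letter to one-letter codes for the twenty common amino acids,
# as listed in the definition above.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "gln": "Q", "glu": "E", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(residues: list[str]) -> str:
    """Convert three-letter residue codes to a one-letter string.

    Raises KeyError for codes outside the twenty common amino acids.
    """
    return "".join(THREE_TO_ONE[code.lower()] for code in residues)


# Hypothetical tripeptide written with three-letter codes.
peptide = to_one_letter(["Met", "Ala", "Gly"])
```

Unnatural amino acids have no entry in this table, so a caller would need to extend it or handle the resulting `KeyError`.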

The term “aliphatic” refers to alkyl, alkenyl, alkynyl, and carbocyclic groups. Likewise, the term “heteroaliphatic” refers to heteroalkyl, heteroalkenyl, heteroalkynyl, and heterocyclic groups.

The term “alkyl” refers to a radical of, or a substituent that is, a straight-chain or branched saturated hydrocarbon group having from 1 to 20 carbon atoms (“C₁₋₂₀ alkyl”). In certain embodiments, the term “alkyl” refers to a radical of, or a substituent that is, a straight-chain or branched saturated hydrocarbon group having from 1 to 10 carbon atoms (“C₁₋₁₀ alkyl”). In some embodiments, an alkyl group has 1 to 9 carbon atoms (“C₁₋₉ alkyl”). In some embodiments, an alkyl group has 1 to 8 carbon atoms (“C₁₋₈ alkyl”). In some embodiments, an alkyl group has 1 to 7 carbon atoms (“C₁₋₇ alkyl”). In some embodiments, an alkyl group has 1 to 6 carbon atoms (“C₁₋₆ alkyl”). In some embodiments, an alkyl group has 2 to 6 carbon atoms (“C₂₋₆ alkyl”). In some embodiments, an alkyl group has 3 to 5 carbon atoms (“C₃₋₅ alkyl”). In some embodiments, an alkyl group has 5 carbon atoms (“C₅ alkyl”). In some embodiments, the alkyl group has 3 carbon atoms (“C₃ alkyl”). In some embodiments, an alkyl group has 1 to 5 carbon atoms (“C₁₋₅ alkyl”). In some embodiments, an alkyl group has 1 to 4 carbon atoms (“C₁₋₄ alkyl”). In some embodiments, an alkyl group has 1 to 3 carbon atoms (“C₁₋₃ alkyl”). In some embodiments, an alkyl group has 1 to 2 carbon atoms (“C₁₋₂ alkyl”). In some embodiments, an alkyl group has 1 carbon atom (“C₁ alkyl”).

Examples of C₁₋₆ alkyl groups include methyl (C₁), ethyl (C₂), propyl (C₃) (e.g., n-propyl, isopropyl), butyl (C₄) (e.g., n-butyl, tert-butyl, sec-butyl, iso-butyl), pentyl (C₅) (e.g., n-pentyl, 3-pentanyl, amyl, neopentyl, 3-methyl-2-butanyl, tertiary amyl), and hexyl (C₆) (e.g., n-hexyl). Additional examples of alkyl groups include n-heptyl (C₇), n-octyl (C₈), and the like. Unless otherwise specified, each instance of an alkyl group is independently unsubstituted (an “unsubstituted alkyl”) or substituted (a “substituted alkyl”) with one or more substituents (e.g., halogen, such as F). In certain embodiments, the alkyl group is an unsubstituted C₁₋₁₀ alkyl (such as unsubstituted C₁₋₆ alkyl, e.g., —CH₃ (Me), unsubstituted ethyl (Et), unsubstituted propyl (Pr, e.g., unsubstituted n-propyl (n-Pr), unsubstituted isopropyl (i-Pr)), unsubstituted butyl (Bu, e.g., unsubstituted n-butyl (n-Bu), unsubstituted tert-butyl (tert-Bu or t-Bu), unsubstituted sec-butyl (sec-Bu), unsubstituted isobutyl (i-Bu))). In certain embodiments, the alkyl group is a substituted C₁₋₁₀ alkyl (such as substituted C₁₋₆ alkyl, e.g., —CF₃, benzyl).

The term “acyl” refers to a group having the general formula —C(═O)R^(X1), —C(═O)OR^(X1), —C(═O)—O—C(═O)R^(X1), —C(═O)SR^(X1), —C(═O)N(R^(X1))₂, —C(═S)R^(X1), —C(═S)N(R^(X1))₂, —C(═S)S(R^(X1)), —C(═NR^(X1))R^(X1), —C(═NR^(X1))OR^(X1), —C(═NR^(X1))SR^(X1), and —C(═NR^(X1))N(R^(X1))₂, wherein R^(X1) is hydrogen; halogen; substituted or unsubstituted hydroxyl; substituted or unsubstituted thiol; substituted or unsubstituted amino; substituted or unsubstituted acyl; cyclic or acyclic, substituted or unsubstituted, branched or unbranched aliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched heteroaliphatic; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkyl; cyclic or acyclic, substituted or unsubstituted, branched or unbranched alkenyl; substituted or unsubstituted alkynyl; substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, mono- or di-aliphaticamino, mono- or di-heteroaliphaticamino, mono- or di-alkylamino, mono- or di-heteroalkylamino, mono- or di-arylamino, or mono- or di-heteroarylamino; or two R^(X1) groups taken together form a 5- to 6-membered heterocyclic ring. Exemplary acyl groups include aldehydes (—CHO), carboxylic acids (—CO₂H), ketones, acyl halides, esters, amides, imines, carbonates, carbamates, and ureas. Acyl substituents include, but are not limited to, any of the substituents described in this application that result in the formation of a stable moiety (e.g., aliphatic, alkyl, alkenyl, alkynyl, heteroaliphatic, heterocyclic, aryl, heteroaryl, acyl, oxo, imino, thiooxo, cyano, isocyano, amino, azido, nitro, hydroxyl, thiol, halo, aliphaticamino, heteroaliphaticamino, alkylamino, heteroalkylamino, arylamino, heteroarylamino, alkylaryl, arylalkyl, aliphaticoxy, heteroaliphaticoxy, alkyloxy, heteroalkyloxy, aryloxy, heteroaryloxy, aliphaticthioxy, heteroaliphaticthioxy, alkylthioxy, heteroalkylthioxy, arylthioxy, heteroarylthioxy, acyloxy, and the like, each of which may or may not be further substituted).

“Alkenyl” refers to a radical of, or a substituent that is, a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms, one or more carbon-carbon double bonds, and no triple bonds (“C₂₋₂₀ alkenyl”). In some embodiments, an alkenyl group has 2 to 10 carbon atoms (“C₂₋₁₀ alkenyl”). In some embodiments, an alkenyl group has 2 to 9 carbon atoms (“C₂₋₉ alkenyl”). In some embodiments, an alkenyl group has 2 to 8 carbon atoms (“C₂₋₈ alkenyl”). In some embodiments, an alkenyl group has 2 to 7 carbon atoms (“C₂₋₇ alkenyl”). In some embodiments, an alkenyl group has 2 to 6 carbon atoms (“C₂₋₆ alkenyl”). In some embodiments, an alkenyl group has 2 to 5 carbon atoms (“C₂₋₅ alkenyl”). In some embodiments, an alkenyl group has 2 to 4 carbon atoms (“C₂₋₄ alkenyl”). In some embodiments, an alkenyl group has 2 to 3 carbon atoms (“C₂₋₃ alkenyl”). In some embodiments, an alkenyl group has 2 carbon atoms (“C₂ alkenyl”). The one or more carbon-carbon double bonds can be internal (such as in 2-butenyl) or terminal (such as in 1-butenyl). Examples of C₂₋₄ alkenyl groups include ethenyl (C₂), 1-propenyl (C₃), 2-propenyl (C₃), 1-butenyl (C₄), 2-butenyl (C₄), butadienyl (C₄), and the like. Examples of C₂₋₆ alkenyl groups include the aforementioned C₂₋₄ alkenyl groups as well as pentenyl (C₅), pentadienyl (C₅), hexenyl (C₆), and the like. Additional examples of alkenyl include heptenyl (C₇), octenyl (C₈), octatrienyl (C₈), and the like. Unless otherwise specified, each instance of an alkenyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted alkenyl”) or substituted (a “substituted alkenyl”) with one or more substituents. In certain embodiments, the alkenyl group is unsubstituted C₂₋₁₀ alkenyl. In certain embodiments, the alkenyl group is substituted C₂₋₁₀ alkenyl.

“Alkynyl” refers to a radical of, or a substituent that is, a straight-chain or branched hydrocarbon group having from 2 to 20 carbon atoms, one or more carbon-carbon triple bonds, and optionally one or more double bonds (“C₂₋₂₀ alkynyl”). In some embodiments, an alkynyl group has 2 to 10 carbon atoms (“C₂₋₁₀ alkynyl”). In some embodiments, an alkynyl group has 2 to 9 carbon atoms (“C₂₋₉ alkynyl”). In some embodiments, an alkynyl group has 2 to 8 carbon atoms (“C₂₋₈ alkynyl”). In some embodiments, an alkynyl group has 2 to 7 carbon atoms (“C₂₋₇ alkynyl”). In some embodiments, an alkynyl group has 2 to 6 carbon atoms (“C₂₋₆ alkynyl”). In some embodiments, an alkynyl group has 2 to 5 carbon atoms (“C₂₋₅ alkynyl”). In some embodiments, an alkynyl group has 2 to 4 carbon atoms (“C₂₋₄ alkynyl”). In some embodiments, an alkynyl group has 2 to 3 carbon atoms (“C₂₋₃ alkynyl”). In some embodiments, an alkynyl group has 2 carbon atoms (“C₂ alkynyl”). The one or more carbon-carbon triple bonds can be internal (such as in 2-butynyl) or terminal (such as in 1-butynyl). Examples of C₂₋₄ alkynyl groups include, without limitation, ethynyl (C₂), 1-propynyl (C₃), 2-propynyl (C₃), 1-butynyl (C₄), 2-butynyl (C₄), and the like. Examples of C₂₋₆ alkynyl groups include the aforementioned C₂₋₄ alkynyl groups as well as pentynyl (C₅), hexynyl (C₆), and the like. Additional examples of alkynyl include heptynyl (C₇), octynyl (C₈), and the like. Unless otherwise specified, each instance of an alkynyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted alkynyl”) or substituted (a “substituted alkynyl”) with one or more substituents. In certain embodiments, the alkynyl group is unsubstituted C₂₋₁₀ alkynyl. In certain embodiments, the alkynyl group is substituted C₂₋₁₀ alkynyl.

“Carbocyclyl” or “carbocyclic” refers to a radical of a non-aromatic cyclic hydrocarbon group having from 3 to 10 ring carbon atoms (“C₃₋₁₀ carbocyclyl”) and zero heteroatoms in the non-aromatic ring system. In some embodiments, a carbocyclyl group has 3 to 8 ring carbon atoms (“C₃₋₈ carbocyclyl”). In some embodiments, a carbocyclyl group has 3 to 6 ring carbon atoms (“C₃₋₆ carbocyclyl”). In some embodiments, a carbocyclyl group has 5 to 10 ring carbon atoms (“C₅₋₁₀ carbocyclyl”). Exemplary C₃₋₆ carbocyclyl groups include, without limitation, cyclopropyl (C₃), cyclopropenyl (C₃), cyclobutyl (C₄), cyclobutenyl (C₄), cyclopentyl (C₅), cyclopentenyl (C₅), cyclohexyl (C₆), cyclohexenyl (C₆), cyclohexadienyl (C₆), and the like. Exemplary C₃₋₈ carbocyclyl groups include, without limitation, the aforementioned C₃₋₆ carbocyclyl groups as well as cycloheptyl (C₇), cycloheptenyl (C₇), cycloheptadienyl (C₇), cycloheptatrienyl (C₇), cyclooctyl (C₈), cyclooctenyl (C₈), bicyclo[2.2.1]heptanyl (C₇), bicyclo[2.2.2]octanyl (C₈), and the like. Exemplary C₃₋₁₀ carbocyclyl groups include, without limitation, the aforementioned C₃₋₈ carbocyclyl groups as well as cyclononyl (C₉), cyclononenyl (C₉), cyclodecyl (C₁₀), cyclodecenyl (C₁₀), octahydro-1H-indenyl (C₉), decahydronaphthalenyl (C₁₀), spiro[4.5]decanyl (C₁₀), and the like. As the foregoing examples illustrate, in certain embodiments, the carbocyclyl group is either monocyclic (“monocyclic carbocyclyl”) or contains a fused, bridged or spiro ring system such as a bicyclic system (“bicyclic carbocyclyl”) and can be saturated or can be partially unsaturated. “Carbocyclyl” also includes ring systems wherein the carbocyclic ring, as defined above, is fused with one or more aryl or heteroaryl groups wherein the point of attachment is on the carbocyclic ring, and in such instances, the number of carbons continues to designate the number of carbons in the carbocyclic ring system. Unless otherwise specified, each instance of a carbocyclyl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted carbocyclyl”) or substituted (a “substituted carbocyclyl”) with one or more substituents. In certain embodiments, the carbocyclyl group is unsubstituted C₃₋₁₀ carbocyclyl. In certain embodiments, the carbocyclyl group is a substituted C₃₋₁₀ carbocyclyl.

In some embodiments, “carbocyclyl” is a monocyclic, saturated carbocyclyl group having from 3 to 10 ring carbon atoms (“C₃₋₁₀ cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 8 ring carbon atoms (“C₃₋₈ cycloalkyl”). In some embodiments, a cycloalkyl group has 3 to 6 ring carbon atoms (“C₃₋₆ cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 6 ring carbon atoms (“C₅₋₆ cycloalkyl”). In some embodiments, a cycloalkyl group has 5 to 10 ring carbon atoms (“C₅₋₁₀ cycloalkyl”). Examples of C₅₋₆ cycloalkyl groups include cyclopentyl (C₅) and cyclohexyl (C₆). Examples of C₃₋₆ cycloalkyl groups include the aforementioned C₅₋₆ cycloalkyl groups as well as cyclopropyl (C₃) and cyclobutyl (C₄). Examples of C₃₋₈ cycloalkyl groups include the aforementioned C₃₋₆ cycloalkyl groups as well as cycloheptyl (C₇) and cyclooctyl (C₈). Unless otherwise specified, each instance of a cycloalkyl group is independently unsubstituted (an “unsubstituted cycloalkyl”) or substituted (a “substituted cycloalkyl”) with one or more substituents. In certain embodiments, the cycloalkyl group is unsubstituted C₃₋₁₀ cycloalkyl. In certain embodiments, the cycloalkyl group is substituted C₃₋₁₀ cycloalkyl.

“Aryl” refers to a radical of a monocyclic or polycyclic (e.g., bicyclic or tricyclic) 4n+2 aromatic ring system (e.g., having 6, 10, or 14 pi electrons shared in a cyclic array) having 6-14 ring carbon atoms and zero heteroatoms provided in the aromatic ring system (“C₆₋₁₄ aryl”). In some embodiments, an aryl group has six ring carbon atoms (“C₆ aryl”; e.g., phenyl). In some embodiments, an aryl group has ten ring carbon atoms (“C₁₀ aryl”; e.g., naphthyl such as 1-naphthyl and 2-naphthyl). In some embodiments, an aryl group has fourteen ring carbon atoms (“C₁₄ aryl”; e.g., anthracyl). “Aryl” also includes ring systems wherein the aryl ring, as defined above, is fused with one or more carbocyclyl or heterocyclyl groups wherein the radical or point of attachment is on the aryl ring, and in such instances, the number of carbon atoms continues to designate the number of carbon atoms in the aryl ring system. Unless otherwise specified, each instance of an aryl group is independently optionally substituted, i.e., unsubstituted (an “unsubstituted aryl”) or substituted (a “substituted aryl”) with one or more substituents. In certain embodiments, the aryl group is unsubstituted C₆₋₁₄ aryl. In certain embodiments, the aryl group is substituted C₆₋₁₄ aryl.
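
The “4n+2” electron count in the definition above can be checked arithmetically. The sketch below tests only whether a pi-electron count has the form 4n+2 for a non-negative integer n; it does not establish aromaticity, which also depends on factors such as planarity and cyclic conjugation. The function name is hypothetical.

```python
def satisfies_4n_plus_2(pi_electrons: int) -> bool:
    """Return True if the count equals 4n + 2 for some integer n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# The counts named in the definition (6, 10, and 14 pi electrons)
# correspond to n = 1, 2, and 3.
counts_from_definition = [6, 10, 14]
all_match = all(satisfies_4n_plus_2(count) for count in counts_from_definition)
```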

“Aralkyl” is a subset of alkyl and aryl and refers to an optionally substituted alkyl group substituted by an optionally substituted aryl group. In certain embodiments, the aralkyl is optionally substituted benzyl. In certain embodiments, the aralkyl is benzyl. In certain embodiments, the aralkyl is optionally substituted phenethyl. In certain embodiments, the aralkyl is phenethyl. In certain embodiments, the aralkyl is 7-phenylheptanyl. In certain embodiments, the aralkyl is C₇ alkyl substituted by an optionally substituted aryl group (e.g., phenyl). In certain embodiments, the aralkyl is a C₇₋₁₀ alkyl group substituted by an optionally substituted aryl group (e.g., phenyl).

“Partially unsaturated” refers to a group that includes at least one double or triple bond. A “partially unsaturated” ring system is further intended to encompass rings having multiple sites of unsaturation but is not intended to include aromatic groups (e.g., aryl or heteroaryl groups) as defined in this application. Likewise, “saturated” refers to a group that does not contain a double or triple bond, i.e., contains all single bonds.

The term “optionally substituted” means substituted or unsubstituted.

Alkyl, alkenyl, alkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl groups are optionally substituted (e.g., “substituted” or “unsubstituted” alkyl, “substituted” or “unsubstituted” alkenyl, “substituted” or “unsubstituted” alkynyl, “substituted” or “unsubstituted” carbocyclyl, “substituted” or “unsubstituted” heterocyclyl, “substituted” or “unsubstituted” aryl or “substituted” or “unsubstituted” heteroaryl group). In general, the term “substituted,” whether preceded by the term “optionally” or not, means that at least one hydrogen present on a group (e.g., a carbon or nitrogen atom) is replaced with a permissible substituent, e.g., a substituent which upon substitution results in a stable compound, e.g., a compound which does not spontaneously undergo transformation such as by rearrangement, cyclization, elimination, or other reaction. Unless otherwise indicated, a “substituted” group has a substituent at one or more substitutable positions of the group, and when more than one position in any given structure is substituted, the substituent is either the same or different at each position. The term “substituted” is contemplated to include substitution with all permissible substituents of organic compounds, including any of the substituents described in this application that result in the formation of a stable compound. The present invention contemplates any and all such combinations in order to arrive at a stable compound. For purposes of this invention, heteroatoms such as nitrogen may have hydrogen substituents and/or any suitable substituent as described in this application which satisfies the valencies of the heteroatoms and results in the formation of a stable moiety.

Exemplary carbon atom substituents include, but are not limited to, halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OR^(aa), —ON(R^(bb))₂, —N(R^(bb))₂, —N(R^(bb))₃⁺X⁻, —N(OR^(cc))R^(bb), —SH, —SR^(aa), —SSR^(cc), —C(═O)R^(aa), —CO₂H, —CHO, —C(OR^(cc))₂, —CO₂R^(aa), —OC(═O)R^(aa), —OCO₂R^(aa), —C(═O)N(R^(bb))₂, —OC(═O)N(R^(bb))₂, —NR^(bb)C(═O)R^(aa), —NR^(bb)CO₂R^(aa), —NR^(bb)C(═O)N(R^(bb))₂, —C(═NR^(bb))R^(aa), —C(═NR^(bb))OR^(aa), —OC(═NR^(bb))R^(aa), —OC(═NR^(bb))OR^(aa), —C(═NR^(bb))N(R^(bb))₂, —OC(═NR^(bb))N(R^(bb))₂, —NR^(bb)C(═NR^(bb))N(R^(bb))₂, —C(═O)NR^(bb)SO₂R^(aa), —NR^(bb)SO₂R^(aa), —SO₂N(R^(bb))₂, —SO₂R^(aa), —SO₂OR^(aa), —OSO₂R^(aa), —S(═O)R^(aa), —OS(═O)R^(aa), —Si(R^(aa))₃, —OSi(R^(aa))₃, —C(═S)N(R^(bb))₂, —C(═O)SR^(aa), —C(═S)SR^(aa), —SC(═S)SR^(aa), —SC(═O)SR^(aa), —OC(═O)SR^(aa), —SC(═O)OR^(aa), —SC(═O)R^(aa), —P(═O)(R^(aa))₂, —P(═O)(OR^(cc))₂, —OP(═O)(R^(aa))₂, —OP(═O)(OR^(cc))₂, —P(═O)(N(R^(bb))₂)₂, —OP(═O)(N(R^(bb))₂)₂, —NR^(bb)P(═O)(R^(aa))₂, —NR^(bb)P(═O)(OR^(cc))₂, —NR^(bb)P(═O)(N(R^(bb))₂)₂, —P(R^(cc))₂, —P(OR^(cc))₂, —P(R^(cc))₃⁺X⁻, —P(OR^(cc))₃⁺X⁻, —P(R^(cc))₄, —P(OR^(cc))₄, —OP(R^(cc))₂, —OP(R^(cc))₃⁺X⁻, —OP(OR^(cc))₂, —OP(OR^(cc))₃⁺X⁻, —OP(R^(cc))₄, —OP(OR^(cc))₄, —B(R^(aa))₂, —B(OR^(cc))₂, —BR^(aa)(OR^(cc)), C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl;

wherein:

each instance of R^(aa) is, independently, selected from C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(aa) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups;

each instance of R^(bb) is, independently, selected from hydrogen, —OH, —OR^(aa), —N(R^(cc))₂, —CN, —C(═O)R^(aa), —C(═O)N(R^(cc))₂, —CO₂R^(aa), —SO₂R^(aa), —C(═NR^(cc))OR^(aa), —C(═NR^(cc))N(R^(cc))₂, —SO₂N(R^(cc))₂, —SO₂R^(cc), —SO₂OR^(cc), —SOR^(aa), —C(═S)N(R^(cc))₂, —C(═O)SR^(cc), —C(═S)SR^(cc), —P(═O)(R^(aa))₂, —P(═O)(OR^(cc))₂, —P(═O)(N(R^(cc))₂)₂, C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(bb) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups; wherein X⁻ is a counterion;

each instance of R^(cc) is, independently, selected from hydrogen, C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(cc) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups;

each instance of R^(dd) is, independently, selected from halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OR^(ee), —ON(R^(ff))₂, —N(R^(ff))₂, —N(R^(ff))₃ ⁺X⁻, —N(OR^(ee))R^(ff), —SH, —SR^(ee), —SSR^(ee), —C(═O)R^(ee), —CO₂H, —CO₂R^(ee), —OC(═O)R^(ee), —OCO₂R^(ee), —C(═O)N(R^(ff))₂, —OC(═O)N(R^(ff))₂, —NR^(ff)C(═O)R^(ee), —NR^(ff)CO₂R^(ee), —NR^(ff)C(═O)N(R^(ff))₂, —C(═NR^(ff))OR^(ee), —OC(═NR^(ff))R^(ee), —OC(═NR^(ff))OR^(ee), —C(═NR^(ff))N(R^(ff))₂, —OC(═NR^(ff))N(R^(ff))₂, —NR^(ff)C(═NR^(ff))N(R^(ff))₂, —NR^(ff)SO₂R^(ee), —SO₂N(R^(ff))₂, —SO₂R^(ee), —SO₂OR^(ee), —OSO₂R^(ee), —S(═O)R^(ee), —Si(R^(ee))₃, —OSi(R^(ee))₃, —C(═S)N(R^(ff))₂, —C(═O)SR^(ee), —C(═S)SR^(ee), —SC(═S)SR^(ee), —P(═O)(OR^(ee))₂, —P(═O)(R^(ee))₂, —OP(═O)(R^(ee))₂, —OP(═O)(OR^(ee))₂, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, 3-10 membered heterocyclyl, C₆₋₁₀ aryl, 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups, or two geminal R^(dd) substituents can be joined to form ═O or ═S; wherein X⁻ is a counterion;

each instance of R^(ee) is, independently, selected from C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, C₆₋₁₀ aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups;

each instance of R^(ff) is, independently, selected from hydrogen, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, 3-10 membered heterocyclyl, C₆₋₁₀ aryl and 5-10 membered heteroaryl, or two R^(ff) groups are joined to form a 3-10 membered heterocyclyl or 5-10 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups; and

each instance of R^(gg) is, independently, halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OC₁₋₆ alkyl, —ON(C₁₋₆ alkyl)₂, —N(C₁₋₆ alkyl)₂, —N(C₁₋₆ alkyl)₃ ⁺X⁻, —NH(C₁₋₆ alkyl)₂ ⁺X⁻, —NH₂(C₁₋₆ alkyl)⁺X⁻, —NH₃⁺X⁻, —N(OC₁₋₆ alkyl)(C₁₋₆ alkyl), —N(OH)(C₁₋₆ alkyl), —NH(OH), —SH, —SC₁₋₆ alkyl, —SS(C₁₋₆ alkyl), —C(═O)(C₁₋₆ alkyl), —CO₂H, —CO₂(C₁₋₆ alkyl), —OC(═O)(C₁₋₆ alkyl), —OCO₂(C₁₋₆ alkyl), —C(═O)NH₂, —C(═O)N(C₁₋₆ alkyl)₂, —OC(═O)NH(C₁₋₆ alkyl), —NHC(═O)(C₁₋₆ alkyl), —N(C₁₋₆ alkyl)C(═O)(C₁₋₆ alkyl), —NHCO₂(C₁₋₆ alkyl), —NHC(═O)N(C₁₋₆ alkyl)₂, —NHC(═O)NH(C₁₋₆ alkyl), —NHC(═O)NH₂, —C(═NH)O(C₁₋₆ alkyl), —OC(═NH)(C₁₋₆ alkyl), —OC(═NH)OC₁₋₆ alkyl, —C(═NH)N(C₁₋₆ alkyl)₂, —C(═NH)NH(C₁₋₆ alkyl), —C(═NH)NH₂, —OC(═NH)N(C₁₋₆ alkyl)₂, —OC(NH)NH(C₁₋₆ alkyl), —OC(NH)NH₂, —NHC(NH)N(C₁₋₆ alkyl)₂, —NHC(═NH)NH₂, —NHSO₂(C₁₋₆ alkyl), —SO₂N(C₁₋₆ alkyl)₂, —SO₂NH(C₁₋₆ alkyl), —SO₂NH₂, —SO₂C₁₋₆ alkyl, —SO₂OC₁₋₆ alkyl, —OSO₂C₁₋₆ alkyl, —SOC₁₋₆ alkyl, —Si(C₁₋₆ alkyl)₃, —OSi(C₁₋₆ alkyl)₃, —C(═S)N(C₁₋₆ alkyl)₂, —C(═S)NH(C₁₋₆ alkyl), —C(═S)NH₂, —C(═O)S(C₁₋₆ alkyl), —C(═S)SC₁₋₆ alkyl, —SC(═S)SC₁₋₆ alkyl, —P(═O)(OC₁₋₆ alkyl)₂, —P(═O)(C₁₋₆ alkyl)₂, —OP(═O)(C₁₋₆ alkyl)₂, —OP(═O)(OC₁₋₆ alkyl)₂, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, C₆₋₁₀ aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal R^(gg) substituents can be joined to form ═O or ═S; wherein X⁻ is a counterion. Alternatively, two geminal hydrogens on a carbon atom are replaced with the group ═O, ═S, ═NN(R^(bb))₂, ═NNR^(bb)C(═O)R^(aa), ═NNR^(bb)C(═O)OR^(aa), ═NNR^(bb)S(═O)₂R^(aa), ═NR^(bb), or ═NOR^(cc); wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups; wherein X⁻ is a counterion;

wherein:

each instance of R^(aa) is, independently, selected from C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(aa) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups;

each instance of R^(bb) is, independently, selected from hydrogen, —OH, —OR^(aa), —N(R^(cc))₂, —CN, —C(═O)R^(aa), —C(═O)N(R^(cc))₂, —CO₂R^(aa), —SO₂R^(aa), —C(═NR^(cc))OR^(aa), —C(═NR^(cc))N(R^(cc))₂, —SO₂N(R^(cc))₂, —SO₂R^(cc), —SO₂OR^(cc), —SOR^(aa), —C(═S)N(R^(cc))₂, —C(═O)SR^(cc), —C(═S)SR^(cc), —P(═O)(R^(aa))₂, —P(═O)(OR^(cc))₂, —P(═O)(N(R^(cc))₂)₂, C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(bb) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups; wherein X⁻ is a counterion;

each instance of R^(cc) is, independently, selected from hydrogen, C₁₋₁₀ alkyl, C₁₋₁₀ perhaloalkyl, C₂₋₁₀ alkenyl, C₂₋₁₀ alkynyl, heteroC₁₋₁₀ alkyl, heteroC₂₋₁₀ alkenyl, heteroC₂₋₁₀ alkynyl, C₃₋₁₀ carbocyclyl, 3-14 membered heterocyclyl, C₆₋₁₄ aryl, and 5-14 membered heteroaryl, or two R^(cc) groups are joined to form a 3-14 membered heterocyclyl or 5-14 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(dd) groups;

each instance of R^(dd) is, independently, selected from halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OR^(ee), —ON(R^(ff))₂, —N(R^(ff))₂, —N(R^(ff))₃ ⁺X⁻, —N(OR^(ee))R^(ff), —SH, —SR^(ee), —SSR^(ee), —C(═O)R^(ee), —CO₂H, —CO₂R^(ee), —OC(═O)R^(ee), —OCO₂R^(ee), —C(═O)N(R^(ff))₂, —OC(═O)N(R^(ff))₂, —NR^(ff)C(═O)R^(ee), —NR^(ff)CO₂R^(ee), —NR^(ff)C(═O)N(R^(ff))₂, —C(═NR^(ff))OR^(ee), —OC(═NR^(ff))R^(ee), —OC(═NR^(ff))OR^(ee), —C(═NR^(ff))N(R^(ff))₂, —OC(═NR^(ff))N(R^(ff))₂, —NR^(ff)C(═NR^(ff))N(R^(ff))₂, —NR^(ff)SO₂R^(ee), —SO₂N(R^(ff))₂, —SO₂R^(ee), —SO₂OR^(ee), —OSO₂R^(ee), —S(═O)R^(ee), —Si(R^(ee))₃, —OSi(R^(ee))₃, —C(═S)N(R^(ff))₂, —C(═O)SR^(ee), —C(═S)SR^(ee), —SC(═S)SR^(ee), —P(═O)(OR^(ee))₂, —P(═O)(R^(ee))₂, —OP(═O)(R^(ee))₂, —OP(═O)(OR^(ee))₂, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, 3-10 membered heterocyclyl, C₆₋₁₀ aryl, 5-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups, or two geminal R^(dd) substituents can be joined to form ═O or ═S; wherein X⁻ is a counterion;

each instance of R^(ee) is, independently, selected from C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, C₆₋₁₀ aryl, 3-10 membered heterocyclyl, and 3-10 membered heteroaryl, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups;

each instance of R^(ff) is, independently, selected from hydrogen, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, 3-10 membered heterocyclyl, C₆₋₁₀ aryl and 5-10 membered heteroaryl, or two R^(ff) groups are joined to form a 3-10 membered heterocyclyl or 5-10 membered heteroaryl ring, wherein each alkyl, alkenyl, alkynyl, heteroalkyl, heteroalkenyl, heteroalkynyl, carbocyclyl, heterocyclyl, aryl, and heteroaryl is independently substituted with 0, 1, 2, 3, 4, or 5 R^(gg) groups; and

each instance of R^(gg) is, independently, halogen, —CN, —NO₂, —N₃, —SO₂H, —SO₃H, —OH, —OC₁₋₆ alkyl, —ON(C₁₋₆ alkyl)₂, —N(C₁₋₆ alkyl)₂, —N(C₁₋₆ alkyl)₃ ⁺X⁻, —NH(C₁₋₆ alkyl)₂ ⁺X⁻, —NH₂(C₁₋₆ alkyl)⁺X⁻, —NH₃⁺X⁻, —N(OC₁₋₆ alkyl)(C₁₋₆ alkyl), —N(OH)(C₁₋₆ alkyl), —NH(OH), —SH, —SC₁₋₆ alkyl, —SS(C₁₋₆ alkyl), —C(═O)(C₁₋₆ alkyl), —CO₂H, —CO₂(C₁₋₆ alkyl), —OC(═O)(C₁₋₆ alkyl), —OCO₂(C₁₋₆ alkyl), —C(═O)NH₂, —C(═O)N(C₁₋₆ alkyl)₂, —OC(═O)NH(C₁₋₆ alkyl), —NHC(═O)(C₁₋₆ alkyl), —N(C₁₋₆ alkyl)C(═O)(C₁₋₆ alkyl), —NHCO₂(C₁₋₆ alkyl), —NHC(═O)N(C₁₋₆ alkyl)₂, —NHC(═O)NH(C₁₋₆ alkyl), —NHC(═O)NH₂, —C(═NH)O(C₁₋₆ alkyl), —OC(═NH)(C₁₋₆ alkyl), —OC(═NH)OC₁₋₆ alkyl, —C(═NH)N(C₁₋₆ alkyl)₂, —C(═NH)NH(C₁₋₆ alkyl), —C(═NH)NH₂, —OC(═NH)N(C₁₋₆ alkyl)₂, —OC(NH)NH(C₁₋₆ alkyl), —OC(NH)NH₂, —NHC(NH)N(C₁₋₆ alkyl)₂, —NHC(═NH)NH₂, —NHSO₂(C₁₋₆ alkyl), —SO₂N(C₁₋₆ alkyl)₂, —SO₂NH(C₁₋₆ alkyl), —SO₂NH₂, —SO₂C₁₋₆ alkyl, —SO₂OC₁₋₆ alkyl, —OSO₂C₁₋₆ alkyl, —SOC₁₋₆ alkyl, —Si(C₁₋₆ alkyl)₃, —OSi(C₁₋₆ alkyl)₃, —C(═S)N(C₁₋₆ alkyl)₂, —C(═S)NH(C₁₋₆ alkyl), —C(═S)NH₂, —C(═O)S(C₁₋₆ alkyl), —C(═S)SC₁₋₆ alkyl, —SC(═S)SC₁₋₆ alkyl, —P(═O)(OC₁₋₆ alkyl)₂, —P(═O)(C₁₋₆ alkyl)₂, —OP(═O)(C₁₋₆ alkyl)₂, —OP(═O)(OC₁₋₆ alkyl)₂, C₁₋₆ alkyl, C₁₋₆ perhaloalkyl, C₂₋₆ alkenyl, C₂₋₆ alkynyl, heteroC₁₋₆ alkyl, heteroC₂₋₆ alkenyl, heteroC₂₋₆ alkynyl, C₃₋₁₀ carbocyclyl, C₆₋₁₀ aryl, 3-10 membered heterocyclyl, 5-10 membered heteroaryl; or two geminal R^(gg) substituents can be joined to form ═O or ═S; wherein X⁻ is a counterion.

A “counterion” or “anionic counterion” is a negatively charged group associated with a positively charged group in order to maintain electronic neutrality. An anionic counterion may be monovalent (i.e., including one formal negative charge). An anionic counterion may also be multivalent (i.e., including more than one formal negative charge), such as divalent or trivalent. Exemplary counterions include halide ions (e.g., F⁻, Cl⁻, Br⁻, I⁻), NO₃⁻, ClO₄⁻, OH⁻, H₂PO₄⁻, HCO₃⁻, HSO₄⁻, sulfonate ions (e.g., methanesulfonate, trifluoromethanesulfonate, p-toluenesulfonate, benzenesulfonate, 10-camphor sulfonate, naphthalene-2-sulfonate, naphthalene-1-sulfonic acid-5-sulfonate, ethan-1-sulfonic acid-2-sulfonate, and the like), carboxylate ions (e.g., acetate, propanoate, benzoate, glycerate, lactate, tartrate, glycolate, gluconate, and the like), BF₄⁻, PF₄⁻, PF₆⁻, AsF₆⁻, SbF₆⁻, B[3,5-(CF₃)₂C₆H₃]₄⁻, B(C₆F₅)₄⁻, BPh₄⁻, Al(OC(CF₃)₃)₄⁻, and carborane anions (e.g., CB₁₁H₁₂⁻ or (HCB₁₁Me₅Br₆)⁻). Exemplary counterions which may be multivalent include CO₃²⁻, HPO₄²⁻, PO₄³⁻, B₄O₇²⁻, SO₄²⁻, S₂O₃²⁻, carboxylate anions (e.g., tartrate, citrate, fumarate, maleate, malate, malonate, gluconate, succinate, glutarate, adipate, pimelate, suberate, azelate, sebacate, salicylate, phthalates, aspartate, glutamate, and the like), and carboranes.

The term “pharmaceutically acceptable salt” refers to those salts which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of humans and lower animals without undue toxicity, irritation, allergic response and the like, and are commensurate with a reasonable benefit/risk ratio. Pharmaceutically acceptable salts are well known in the art. For example, Berge et al. describe pharmaceutically acceptable salts in detail in J. Pharmaceutical Sciences, 1977, 66, 1-19, incorporated by reference. Pharmaceutically acceptable salts of the compounds disclosed in this application include those derived from suitable inorganic and organic acids and bases. Examples of pharmaceutically acceptable, nontoxic acid addition salts are salts of an amino group formed with inorganic acids such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid, and perchloric acid or with organic acids such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid, or malonic acid or by using other methods known in the art such as ion exchange. Other pharmaceutically acceptable salts include adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like. Salts derived from appropriate bases include alkali metal, alkaline earth metal, ammonium and N⁺(C₁₋₄ alkyl)₄ salts. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like. Further pharmaceutically acceptable salts include, when appropriate, nontoxic ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, lower alkyl sulfonate, and aryl sulfonate.

The term “solvate” refers to forms of a compound that are associated with a solvent, usually by a solvolysis reaction. This physical association may include hydrogen bonding. Conventional solvents include water, methanol, ethanol, acetic acid, DMSO, THF, diethyl ether, and the like. The compounds of Formula (1), (9), (10), and (11) may be prepared, e.g., in crystalline form, and may be solvated. Suitable solvates include pharmaceutically acceptable solvates and further include both stoichiometric solvates and non-stoichiometric solvates. In certain instances, the solvate will be capable of isolation, for example, when one or more solvent molecules are incorporated in the crystal lattice of a crystalline solid. “Solvate” encompasses both solution-phase and isolable solvates. Representative solvates include hydrates, ethanolates, and methanolates.

The term “hydrate” refers to a compound that is associated with water. Typically, the number of the water molecules contained in a hydrate of a compound is in a definite ratio to the number of the compound molecules in the hydrate. Therefore, a hydrate of a compound may be represented, for example, by the general formula R·x H₂O, wherein R is the compound and wherein x is a number greater than 0. A given compound may form more than one type of hydrate, including, e.g., monohydrates (x is 1), lower hydrates (x is a number greater than 0 and smaller than 1, e.g., hemihydrates (R·0.5 H₂O)), and polyhydrates (x is a number greater than 1, e.g., dihydrates (R·2 H₂O) and hexahydrates (R·6 H₂O)).

The term “tautomers” refers to compounds that are interchangeable forms of a particular compound structure, and that vary in the displacement of hydrogen atoms and electrons. Thus, two structures may be in equilibrium through the movement of π electrons and an atom (usually H). For example, enols and ketones are tautomers because they are rapidly interconverted by treatment with either acid or base. Another example of tautomerism is the aci- and nitro-forms of phenylnitromethane, which are likewise formed by treatment with acid or base. Tautomeric forms may be relevant to the attainment of the optimal chemical reactivity and biological activity of a compound of interest.

It is also to be understood that compounds that have the same molecular formula but differ in the nature or sequence of bonding of their atoms or the arrangement of their atoms in space are termed “isomers.” Isomers that differ in the arrangement of their atoms in space are termed “stereoisomers.”

Stereoisomers that are not mirror images of one another are termed “diastereomers” and those that are non-superimposable mirror images of each other are termed “enantiomers.” When a compound has an asymmetric center, for example, it is bonded to four different groups, a pair of enantiomers is possible. An enantiomer can be characterized by the absolute configuration of its asymmetric center and described by the R- and S-sequencing rules of Cahn and Prelog. An enantiomer can also be characterized by the manner in which the molecule rotates the plane of polarized light, and designated as dextrorotatory or levorotatory (i.e., as (+) or (−)-isomers respectively). A chiral compound can exist as either an individual enantiomer or as a mixture of enantiomers. A mixture containing equal proportions of the enantiomers is called a “racemic mixture.”

The term “co-crystal” refers to a crystalline structure comprising at least two different components (e.g., a compound described in this application and an acid), wherein each of the components is independently an atom, ion, or molecule. In certain embodiments, none of the components is a solvent. In certain embodiments, at least one of the components is a solvent. A co-crystal of a compound and an acid is different from a salt formed from a compound and the acid. In the salt, a compound described in this application is complexed with the acid in a way that proton transfer (e.g., a complete proton transfer) from the acid to a compound described in this application easily occurs at room temperature. In the co-crystal, however, a compound described in this application is complexed with the acid in a way that proton transfer from the acid to a compound described in this application does not easily occur at room temperature. In certain embodiments, in the co-crystal, there is no proton transfer from the acid to a compound described in this application. In certain embodiments, in the co-crystal, there is partial proton transfer from the acid to a compound described in this application. Co-crystals may be useful to improve the properties (e.g., solubility, stability, and ease of formulation) of a compound described in this application.

The term “polymorphs” refers to a crystalline form of a compound (or a salt, hydrate, or solvate thereof) in a particular crystal packing arrangement. All polymorphs of the same compound have the same elemental composition. Different crystalline forms usually have different X-ray diffraction patterns, infrared spectra, melting points, density, hardness, crystal shape, optical and electrical properties, stability, and solubility. Recrystallization solvent, rate of crystallization, storage temperature, and other factors may cause one crystal form to dominate. Various polymorphs of a compound can be prepared by crystallization under different conditions.

The term “prodrug” refers to compounds, including derivatives of the compounds of Formula (X), (8), (9), (10), or (11), that have cleavable groups and become by solvolysis or under physiological conditions the compounds of Formula (X), (8), (9), (10), or (11) and that are pharmaceutically active in vivo. The prodrugs may have attributes such as, without limitation, solubility, bioavailability, tissue compatibility, or delayed release in a mammalian organism. Examples include, but are not limited to, derivatives of compounds described in this application, including derivatives formed from glycosylation of the compounds described in this application (e.g., glycoside derivatives), carrier-linked prodrugs (e.g., ester derivatives), bioprecursor prodrugs (a prodrug metabolized by molecular modification into the active compound), and the like. Non-limiting examples of glycoside derivatives are disclosed in and incorporated by reference from WO2018208875 and US20190078168. Non-limiting examples of ester derivatives are disclosed in and incorporated by reference from US20170362195.

Other derivatives of the compounds of this invention have activity in both their acid and acid derivative forms, but the acid sensitive form often offers advantages of solubility, bioavailability, tissue compatibility, or delayed release in a mammalian organism (see, Bundgaard, H., Design of Prodrugs, pp. 7-9, 21-24, Elsevier, Amsterdam 1985). Prodrugs include acid derivatives well known to practitioners of the art, such as, for example, esters prepared by reaction of the parent acid with a suitable alcohol, or amides prepared by reaction of the parent acid compound with a substituted or unsubstituted amine, or acid anhydrides, or mixed anhydrides. Simple aliphatic or aromatic esters, amides, and anhydrides derived from acidic groups pendant on the compounds of this invention are particular prodrugs. In some cases it is desirable to prepare double ester type prodrugs such as (acyloxy)alkyl esters or ((alkoxycarbonyl)oxy)alkyl esters. C₁-C₈ alkyl, C₂-C₈ alkenyl, C₂-C₈ alkynyl, aryl, C₇-C₁₂ substituted aryl, and C₇-C₁₂ arylalkyl esters of the compounds of Formula (X), (8), (9), (10), or (11) may be preferred.

Cannabinoids

As used in this application, the term “cannabinoid” includes compounds of Formula (X):

or a pharmaceutically acceptable salt, co-crystal, tautomer, stereoisomer, solvate, hydrate, polymorph, isotopically enriched derivative, or prodrug thereof, wherein R1 is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; R2 and R6 are, independently, hydrogen or carboxyl; R3 and R5 are, independently, hydroxyl, halogen, or alkoxy; and R4 is a hydrogen or an optionally substituted prenyl moiety; or optionally R4 and R3 are taken together with their intervening atoms to form a cyclic moiety, or optionally R4 and R5 are taken together with their intervening atoms to form a cyclic moiety, or optionally both 1) R4 and R3 are taken together with their intervening atoms to form a cyclic moiety and 2) R4 and R5 are taken together with their intervening atoms to form a cyclic moiety. In certain embodiments, R4 and R3 are taken together with their intervening atoms to form a cyclic moiety. In certain embodiments, R4 and R5 are taken together with their intervening atoms to form a cyclic moiety. In certain embodiments, “cannabinoid” refers to a compound of Formula (X), or a pharmaceutically acceptable salt thereof. In certain embodiments, both 1) R4 and R3 are taken together with their intervening atoms to form a cyclic moiety and 2) R4 and R5 are taken together with their intervening atoms to form a cyclic moiety.

In some embodiments, cannabinoids may be synthesized via the following steps: a) one or more reactions to incorporate three additional ketone moieties onto an acyl-CoA scaffold, where the acyl moiety in the acyl-CoA scaffold comprises between four and fourteen carbons; b) a reaction cyclizing the product of step (a); and c) a reaction to incorporate a prenyl moiety to the product of step (b) or a derivative of the product of step (b). In some embodiments, non-limiting examples of the acyl-CoA scaffold described in step (a) include hexanoyl-CoA and butyryl-CoA. In some embodiments, non-limiting examples of the product of step (b) or a derivative of the product of step (b) include olivetolic acid and divarinic acid.

In some embodiments, a cannabinoid compound of Formula (X) is of Formula (X-A), (X-B), or (X-C):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof, wherein

is a double bond or a single bond, as valency permits;

R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

R^(Z1) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

R^(Z2) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

or optionally, R^(Z1) and R^(Z2) are taken together with their intervening atoms to form an optionally substituted carbocyclic ring;

R^(3A) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl;

R^(3B) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl;

R^(Y) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl;

R^(Z) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl.

In certain embodiments, a cannabinoid compound is of Formula (X-A):

wherein

is a double bond, and each of R^(Z1) and R^(Z2) is hydrogen, one of R^(3A) and R^(3B) is optionally substituted C₂₋₆ alkenyl, and the other one of R^(3A) and R^(3B) is optionally substituted C₂₋₆ alkyl. In some embodiments, a cannabinoid compound of Formula (X) is of Formula (X-A), wherein each of R^(Z1) and R^(Z2) is hydrogen, one of R^(3A) and R^(3B) is a prenyl group, and the other one of R^(3A) and R^(3B) is optionally substituted methyl.

In certain embodiments, a cannabinoid compound of Formula (X) of Formula (X-A) is of Formula (11-z):

wherein

is a double bond or single bond, as valency permits; one of R^(3A) and R^(3B) is C₁₋₆ alkyl optionally substituted with alkenyl, and the other of R^(3A) and R^(3B) is optionally substituted C₁₋₆ alkyl. In certain embodiments, in a compound of Formula (11-z),

is a single bond; one of R^(3A) and R^(3B) is C₁₋₆ alkyl optionally substituted with prenyl; and the other one of R^(3A) and R^(3B) is unsubstituted methyl; and R is as described in this application. In certain embodiments, in a compound of Formula (11-z),

is a single bond; one of R^(3A) and R^(3B) is

and the other one of R^(3A) and R^(3B) is unsubstituted methyl; and R is as described in this application. In certain embodiments, a cannabinoid compound of Formula (11-z) is of Formula (11a):

In certain embodiments, a cannabinoid compound of Formula (X-A) is of Formula (10-z):

wherein

is a double bond or single bond, as valency permits; R^(Y) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl; and each of R^(3A) and R^(3B) is independently optionally substituted C₁₋₆ alkyl. In certain embodiments, in a compound of Formula (10-z),

is a single bond; each of R^(3A) and R^(3B) is unsubstituted methyl, and R is as described in this application. In certain embodiments, a cannabinoid compound of Formula (10-z) is of Formula (10a):

In certain embodiments, a compound of Formula (10a)

has a chiral atom labeled with * at carbon 10 and a chiral atom labeled with ** at carbon 6. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the R-configuration and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, a compound of Formula (10a)

is of the formula:

In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the S-configuration and a chiral atom labeled with ** at carbon 6 is of the S-configuration. In certain embodiments, a compound of Formula (10a)

is of the formula:

In certain embodiments, a cannabinoid compound is of Formula (X-B):

wherein

is a double bond; R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl; and each of R^(3A) and R^(3B) is independently optionally substituted C₁₋₆ alkyl. In certain embodiments, in a compound of Formula (X-B), R is optionally substituted C₁₋₆ alkyl; one of R^(3A) and R^(3B) is

; and the other one of R^(3A) and R^(3B) is unsubstituted methyl, and R is as described in this application. In certain embodiments, a compound of Formula (X-B) is of Formula (9a):

In certain embodiments, a compound of Formula (9a)

has a chiral atom labeled with * at carbon 3 and a chiral atom labeled with ** at carbon 4. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the R-configuration and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, a compound of Formula (9a)

is of the formula:

In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the S-configuration and a chiral atom labeled with ** at carbon 4 is of the S-configuration. In certain embodiments, a compound of Formula (9a)

is of the formula:

In certain embodiments, a cannabinoid compound is of Formula (X-C):

wherein R^(Z) is optionally substituted alkyl or optionally substituted alkenyl. In certain embodiments, a compound of Formula (X-C) is of formula:

wherein a is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. In certain embodiments, a is 1. In certain embodiments, a is 2. In certain embodiments, a is 3. In certain embodiments, a is 1, 2, or 3 for a compound of Formula (X-C). In certain embodiments, a cannabinoid compound is of Formula (X-C), and a is 1, 2, 3, 4, or 5. In certain embodiments, a compound of Formula (X-C) is of Formula (8a):

In some embodiments, cannabinoids of the present disclosure comprise cannabinoid receptor ligands. Cannabinoid receptors are a class of cell membrane receptors in the G protein-coupled receptor superfamily. Cannabinoid receptors include the CB₁ receptor and the CB₂ receptor. In some embodiments, cannabinoid receptors comprise GPR18, GPR55, and PPAR. (See Bram et al. “Activation of GPR18 by cannabinoid compounds: a tale of biased agonism” Br J Pharmacol v. 171(16) (2014); Shi et al. “The novel cannabinoid receptor GPR55 mediates anxiolytic-like effects in the medial orbital cortex of mice with acute stress” Molecular Brain 10, No. 38 (2017); and O'Sullivan, Elizabeth. “An update on PPAR activation by cannabinoids” Br J Pharmacol v. 173(12) (2016)).

In some embodiments, cannabinoids comprise endocannabinoids, which are substances produced within the body, and phytocannabinoids, which are cannabinoids that are naturally produced by plants of genus Cannabis. In some embodiments, phytocannabinoids comprise the acidic and decarboxylated acid forms of the naturally-occurring plant-derived cannabinoids, and their synthetic and biosynthetic equivalents.

Over 94 phytocannabinoids have been identified to date (Berman, Paula, et al. “A new ESI-LC/MS approach for comprehensive metabolic profiling of phytocannabinoids in Cannabis.” Scientific reports 8.1 (2018): 14280; El-Alfy et al., 2010, “Antidepressant-like effect of delta-9-tetrahydrocannabinol and other cannabinoids isolated from Cannabis sativa L”, Pharmacology Biochemistry and Behavior 95 (4): 434-42; Rudolf Brenneisen, 2007, Chemistry and Analysis of Phytocannabinoids, each of which is incorporated by reference in this application in its entirety). In some embodiments, cannabinoids comprise Δ⁹-tetrahydrocannabinol (THC) type (e.g., (−)-trans-delta-9-tetrahydrocannabinol or dronabinol, (+)-trans-delta-9-tetrahydrocannabinol, (−)-cis-delta-9-tetrahydrocannabinol, or (+)-cis-delta-9-tetrahydrocannabinol), cannabidiol (CBD) type, cannabigerol (CBG) type, cannabichromene (CBC) type, cannabicyclol (CBL) type, cannabinodiol (CBND) type, or cannabitriol (CBT) type cannabinoids, or any combination thereof (see, e.g., R Pertwee, ed, Handbook of Cannabis (Oxford, UK: Oxford University Press, 2014), which is incorporated by reference in this application in its entirety).
A non-limiting list of cannabinoids comprises: cannabiorcol-C1 (CBNO), CBND-C1 (CBNDO), Δ⁹-trans-Tetrahydrocannabiorcolic acid-C1 (Δ⁹-THCO), Cannabidiorcol-C1 (CBDO), Cannabiorchromene-C1 (CBCO), (−)-Δ⁸-trans-(6aR,10aR)-Tetrahydrocannabiorcol-C1 (Δ⁸-THCO), Cannabiorcyclol C1 (CBLO), CBG-C1 (CBGO), Cannabinol-C2 (CBN-C2), CBND-C2, Δ⁹-THC-C2, CBD-C2, CBC-C2, Δ⁸-THC-C2, CBL-C2, Bisnor-cannabielsoin-C1 (CBEO), CBG-C2, Cannabivarin-C3 (CBNV), Cannabinodivarin-C3 (CBNDV), (−)-Δ⁹-trans-Tetrahydrocannabivarin-C3 (Δ⁹-THCV), (−)-Cannabidivarin-C3 (CBDV), (±)-Cannabichromevarin-C3 (CBCV), (−)-Δ⁸-trans-THC-C3 (Δ⁸-THCV), (±)-(1aS,3aR,8bR,8cR)-Cannabicyclovarin-C3 (CBLV), 2-Methyl-2-(4-methyl-2-pentenyl)-7-propyl-2H-1-benzopyran-5-ol, Δ⁷-tetrahydrocannabivarin-C3 (Δ⁷-THCV), CBE-C2, Cannabigerovarin-C3 (CBGV), Cannabitriol-C1 (CBTO), Cannabinol-C4 (CBN-C4), CBND-C4, (−)-Δ⁹-trans-Tetrahydrocannabinol-C4 (Δ⁹-THC-C4), Cannabidiol-C4 (CBD-C4), CBC-C4, (−)-trans-Δ⁸-THC-C4, CBL-C4, Cannabielsoin-C3 (CBEV), CBG-C4, CBT-C2, Cannabichromanone-C3, Cannabiglendol-C3 (OH-iso-HHCV-C3), Cannabioxepane-C5 (CBX), Dehydrocannabifuran-C5 (DCBF), Cannabinol-C5 (CBN), Cannabinodiol-C5 (CBND), (−)-Δ⁹-trans-Tetrahydrocannabinol-C5 (Δ⁹-THC), (−)-Δ⁸-trans-(6aR,10aR)-Tetrahydrocannabinol-C5 (Δ⁸-THC), (±)-Cannabichromene-C5 (CBC), (−)-Cannabidiol-C5 (CBD), (±)-(1aS,3aR,8bR,8cR)-Cannabicyclol-C5 (CBL), Cannabicitran-C5 (CBR), (−)-Δ⁹-(6aS,10aR-cis)-Tetrahydrocannabinol-C5 ((−)-cis-Δ⁹-THC), (−)-Δ⁷-trans-(1R,3R,6R)-Isotetrahydrocannabinol-C5 (trans-isoΔ⁷-THC), CBE-C4, Cannabigerol-C5 (CBG), Cannabitriol-C3 (CBTV), Cannabinol methyl ether-C5 (CBNM), CBNDM-C5, 8-OH-CBN-C5 (OH-CBN), OH-CBND-C5 (OH-CBND), 10-Oxo-Δ^(6a(10a))-Tetrahydrocannabinol-C5 (OTHC), Cannabichromanone D-C5, Cannabicoumaronone-C5 (CBCON-C5), Cannabidiol monomethyl ether-C5 (CBDM), Δ⁹-THCM-C5, (±)-3″-hydroxy-Δ⁴″-cannabichromene-C5, (5aS,6S,9R,9aR)-Cannabielsoin-C5 (CBE), 2-geranyl-5-hydroxy-3-n-pentyl-1,4-benzoquinone-C5, 5-geranyl olivetolic acid, 5-geranyl olivetolate, 8α-Hydroxy-Δ⁹-Tetrahydrocannabinol-C5 (8α-OH-Δ⁹-THC), 8β-Hydroxy-Δ⁹-Tetrahydrocannabinol-C5 (8β-OH-Δ⁹-THC), 10α-Hydroxy-Δ⁸-Tetrahydrocannabinol-C5 (10α-OH-Δ⁸-THC), 10β-Hydroxy-Δ⁸-Tetrahydrocannabinol-C5 (10β-OH-Δ⁸-THC), 10α-hydroxy-Δ^(9,11)-hexahydrocannabinol-C5, 9β,10β-Epoxyhexahydrocannabinol-C5, OH-CBD-C5 (OH-CBD), Cannabigerol monomethyl ether-C5 (CBGM), Cannabichromanone-C5, CBT-C4, (±)-6,7-cis-epoxycannabigerol-C5, (±)-6,7-trans-epoxycannabigerol-C5, (−)-7-hydroxycannabichromane-C5, Cannabimovone-C5, (−)-trans-Cannabitriol-C5 ((−)-trans-CBT), (+)-trans-Cannabitriol-C5 ((+)-trans-CBT), (±)-cis-Cannabitriol-C5 ((±)-cis-CBT), (−)-trans-1-Ethoxy-9-hydroxy-Δ^(6a(10a))-tetrahydrocannabivarin-C3 [(−)-trans-CBT-OEt], (−)-(6aR,9S,10S,10aR)-9,10-Dihydroxyhexahydrocannabinol-C5 [(−)-Cannabiripsol] (CBR), Cannabichromanone C-C5, (−)-6a,7,10a-Trihydroxy-Δ⁹-tetrahydrocannabinol-C5 [(−)-Cannabitetrol] (CBTT), Cannabichromanone B-C5, 8,9-Dihydroxy-Δ^(6a(10a))-tetrahydrocannabinol-C5 (8,9-Di-OHCBT), (±)-4-acetoxycannabichromene-C5, 2-acetoxy-6-geranyl-3-n-pentyl-1,4-benzoquinone-C5, 11-Acetoxy-Δ⁹-Tetrahydrocannabinol-C5 (11-OAc-Δ⁹-THC), 5-acetyl-4-hydroxycannabigerol-C5, 4-acetoxy-2-geranyl-5-hydroxy-3-n-pentylphenol-C5, (−)-trans-10-Ethoxy-9-hydroxy-Δ^(6a(10a))-tetrahydrocannabinol-C5 ((−)-trans-CBTOEt), sesquicannabigerol-C5 (SesquiCBG), carmagerol-C5, 4-terpenyl cannabinolate-C5, β-fenchyl-Δ⁹-tetrahydrocannabinolate-C5, α-fenchyl-Δ⁹-tetrahydrocannabinolate-C5, epi-bornyl-Δ⁹-tetrahydrocannabinolate-C5, bornyl-Δ⁹-tetrahydrocannabinolate-C5, α-terpenyl-Δ⁹-tetrahydrocannabinolate-C5, 4-terpenyl-Δ⁹-tetrahydrocannabinolate-C5, 6,6,9-trimethyl-3-pentyl-6H-dibenzo[b,d]pyran-1-ol, 3-(1,1-dimethylheptyl)-6,6a,7,8,10,10a-hexahydro-1-hydroxy-6,6-dimethyl-9H-dibenzo[b,d]pyran-9-one, (−)-(3S,4S)-7-hydroxy-Δ⁶-tetrahydrocannabinol-1,1-dimethylheptyl, (+)-(3S,4S)-7-hydroxy-Δ⁶-tetrahydrocannabinol-1,1-dimethylheptyl, 11-hydroxy-Δ⁹-tetrahydrocannabinol, and Δ⁸-tetrahydrocannabinol-11-oic acid; certain piperidine analogs (e.g., (−)-(6S,6aR,9R,10aR)-5,6,6a,7,8,9,10,10a-octahydro-6-methyl-3-[(R)-1-methyl-4-phenylbutoxy]-1,9-phenanthridinediol 1-acetate), certain aminoalkylindole analogs (e.g., (R)-(+)-[2,3-dihydro-5-methyl-3-(4-morpholinylmethyl)-pyrrolo[1,2,3-de]-1,4-benzoxazin-6-yl]-1-naphthalenyl-methanone), certain open pyran ring analogs (e.g., 2-[3-methyl-6-(1-methylethenyl)-2-cyclohexen-1-yl]-5-pentyl-1,3-benzenediol and 4-(1,1-dimethylheptyl)-2,3′-dihydroxy-6′alpha-(3-hydroxypropyl)-1′,2′,3′,4′,5′,6′-hexahydrobiphenyl), tetrahydrocannabiphorol (THCP), cannabidiphorol (CBDP), CBGP, CBCP, their acidic forms, salts of the acidic forms, or any combination thereof.

A cannabinoid described in this application can be a rare cannabinoid. For example, in some embodiments, a cannabinoid described in this application corresponds to a cannabinoid that is naturally produced in conventional Cannabis varieties at concentrations of less than 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.25%, or 0.1% by dry weight of the female flower. In some embodiments, rare cannabinoids include CBGA, CBGVA, THCVA, CBDVA, CBCVA, and CBCA. In some embodiments, rare cannabinoids are cannabinoids that are not THCA, THC, CBDA or CBD.

A cannabinoid described in this application can also be a non-rare cannabinoid.

In some embodiments, the cannabinoid is selected from the cannabinoids listed in Table 1.

TABLE 1 Non-limiting examples of cannabinoids according to the present disclosure.

Cannabinoid | Abbreviation
Δ⁹-Tetrahydrocannabinol | Δ⁹-THC-C₅
Δ⁹-Tetrahydrocannabinol-C₄ | Δ⁹-THC-C₄
Δ⁹-Tetrahydrocannabivarin | Δ⁹-THCV-C₃
Δ⁹-Tetrahydrocannabiorcol | Δ⁹-THCO-C₁
(−)-(6aS,10aR)-Δ⁹-Tetrahydrocannabinol | (−)-cis-Δ⁹-THC-C₅
Δ⁹-Tetrahydrocannabinolic acid A | Δ⁹-THCA-C₅ A
Δ⁹-Tetrahydrocannabinolic acid B | Δ⁹-THCA-C₅ B
Δ⁹-Tetrahydrocannabinolic acid-C₄ A and/or B | Δ⁹-THCA-C₄ A and/or B
Δ⁹-Tetrahydrocannabivarinic acid A | Δ⁹-THCVA-C₃ A
Δ⁹-Tetrahydrocannabiorcolic acid A and/or B | Δ⁹-THCOA-C₁ A and/or B
(−)-Δ⁸-trans-(6aR,10aR)-Δ⁸-Tetrahydrocannabinol | Δ⁸-THC-C₅
(−)-Δ⁸-trans-(6aR,10aR)-Tetrahydrocannabinolic |
(−)-Cannabidiol | CBD-C5
Cannabidiol monomethyl ether | CBDM-C5
Cannabidiol-C4 | CBD-C4
Cannabidiolic acid | CBDA-C5
Cannabidivarinic acid | CBDVA-C3
(−)-Cannabidivarin | CBDV-C3
Cannabidiorcol | CBD-C1
Cannabigerolic acid A | (E)-CBGA-C₅ A
Cannabigerol | (E)-CBG-C₅
Cannabigerol monomethyl ether | (E)-CBGM-C₅ A
Cannabinerolic acid A | (Z)-CBGA-C₅ A
Cannabigerovarin | (E)-CBGV-C₃
Cannabigerol | (E)-CBG-C₅
Cannabigerolic acid A | (E)-CBGA-C₅ A
Cannabigerolic acid A monomethyl ether | (E)-CBGAM-C₅ A
Cannabigerovarinic acid A | (E)-CBGVA-C₃ A
Cannabinolic acid A | CBNA-C5 A
Cannabinol methyl ether | CBNM-C5
Cannabinol | CBN-C5
Cannabinol-C4 | CBN-C4
Cannabivarin | CBN-C3
Cannabinol-C2 | CBN-C2
Cannabiorcol | CBN-C1
(±)-Cannabichromene | CBC-C₅
(±)-Cannabichromenic acid A | CBCA-C₅ A
(±)-Cannabivarichromene, (±)-Cannabichromevarin | CBCV-C₃
(±)-Cannabichromevarinic acid A | CBCVA-C₃ A
(±)-Cannabichromene | CBC-C₅
(±)-(1aS,3aR,8bR,8cR)-Cannabicyclol | CBL-C₅
(±)-(1aS,3aR,8bR,8cR)-Cannabicyclolic acid A | CBLA-C₅ A
(±)-(1aS,3aR,8bR,8cR)-Cannabicyclovarin | CBLV-C₃
(−)-(9R,10R)-trans-10-O-Ethyl-cannabitriol | (−)-trans-CBT-OEt-C5
(±)-(9R,10R/9S,10S)-Cannabitriol-C3 | (±)-trans-CBT-C3
(−)-(9R,10R)-trans-Cannabitriol | (−)-trans-CBT-C5
(+)-(9S,10S)-Cannabitriol | (+)-trans-CBT-C5
(±)-(9R,10S/9S,10R)-Cannabitriol | (±)-cis-CBT-C5
(−)-6a,7,10a-Trihydroxy-Δ⁹-tetrahydrocannabinol | (−)-Cannabitetrol
10-Oxo-Δ^(6a(10a))-tetrahydrocannabinol | OTHC
8,9-Dihydroxy-Δ^(6a(10a))-tetrahydrocannabinol | 8,9-Di-OH-CBT-C5
Cannabidiolic acid A cannabitriol ester | CBDA-C5 9-OH-CBT-C5 ester
(−)-(6aR,9S,10S,10aR)-9,10-Dihydroxyhexahydrocannabinol, Cannabiripsol | Cannabiripsol-C5
(5aS,6S,9R,9aR)-Cannabielsoic acid B | CBEA-C5 B
(5aS,6S,9R,9aR)-C3-Cannabielsoic acid B | CBEA-C3 B
(5aS,6S,9R,9aR)-Cannabielsoin | CBE-C5
(5aS,6S,9R,9aR)-C3-Cannabielsoin | CBE-C3
(5aS,6S,9R,9aR)-Cannabielsoic acid A | CBEA-C5 A
Cannabiglendol-C3 | OH-iso-HHCV-C3
Dehydrocannabifuran | DCBF-C5
Cannabifuran | CBF-C5
Cannabidiphorol | CBDP
Tetrahydrocannabiphorol | THCP

Biosynthesis of Cannabinoids and Cannabinoid Precursors

Aspects of the present disclosure provide tools, sequences, and methods for the biosynthetic production of cannabinoids in host cells. In some embodiments, the present disclosure teaches expression of enzymes that are capable of producing cannabinoids by biosynthesis.

As a non-limiting example, one or more of the enzymes depicted in FIG. 2 may be used to produce a cannabinoid or cannabinoid precursor of interest. FIG. 1 shows a cannabinoid biosynthesis pathway for the most abundant phytocannabinoids found in Cannabis. See also, de Meijer et al. I, II, III, and IV (I: 2003, Genetics, 163:335-346; II: 2005, Euphytica, 145:189-198; III: 2009, Euphytica, 165:293-311; and IV: 2009, Euphytica, 168:95-112), and Carvalho et al. “Designing Microorganisms for Heterologous Biosynthesis of Cannabinoids” (2017) FEMS Yeast Research June 1; 17(4), each of which is incorporated by reference in this application in its entirety for all purposes.

It should be appreciated that a precursor substrate for use in cannabinoid biosynthesis is generally selected based on the cannabinoid of interest. Non-limiting examples of cannabinoid precursors include compounds of Formulae 1-8 in FIG. 2. In some embodiments, polyketides, including compounds of Formula 5, could be prenylated. In certain embodiments, the precursor is a precursor compound shown in FIG. 1, 2, or 3. Substrates containing 1-40 carbon atoms are preferred. In some embodiments, substrates containing 3-8 carbon atoms are most preferred.

As used in this application, a cannabinoid or a cannabinoid precursor may comprise an R group. See, e.g., FIG. 2. In some embodiments, R may be a hydrogen. In certain embodiments, R is optionally substituted alkyl. In certain embodiments, R is optionally substituted C1-40 alkyl. In certain embodiments, R is optionally substituted C2-40 alkyl. In certain embodiments, R is optionally substituted C2-40 alkyl, which is straight chain or branched alkyl. In certain embodiments, R is optionally substituted C3-8 alkyl. In certain embodiments, R is optionally substituted C1-C40 alkyl, C1-C20 alkyl, C1-C10 alkyl, C1-C8 alkyl, C1-C5 alkyl, C3-C5 alkyl, C3 alkyl, or C5 alkyl. In certain embodiments, R is optionally substituted C1-C20 alkyl. In certain embodiments, R is optionally substituted C1-C10 alkyl. In certain embodiments, R is optionally substituted C1-C8 alkyl. In certain embodiments, R is optionally substituted C1-C5 alkyl. In certain embodiments, R is optionally substituted C1-C7 alkyl. In certain embodiments, R is optionally substituted C3-C5 alkyl. In certain embodiments, R is optionally substituted C3 alkyl. In certain embodiments, R is unsubstituted C3 alkyl. In certain embodiments, R is n-C3 alkyl. In certain embodiments, R is n-propyl. In certain embodiments, R is n-butyl. In certain embodiments, R is n-pentyl. In certain embodiments, R is n-hexyl. In certain embodiments, R is n-heptyl. In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted C4 alkyl. In certain embodiments, R is unsubstituted C4 alkyl. In certain embodiments, R is optionally substituted C5 alkyl. In certain embodiments, R is unsubstituted C5 alkyl. In certain embodiments, R is optionally substituted C6 alkyl. In certain embodiments, R is unsubstituted C6 alkyl. In certain embodiments, R is optionally substituted C7 alkyl. In certain embodiments, R is unsubstituted C7 alkyl. In certain embodiments, R is of formula:

In certain embodiments R is of formula:

In certain embodiments, R is of formula:

In certain embodiments R is of formula:

In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted n-propyl. In certain embodiments, R is n-propyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-propyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-propyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted butyl. In certain embodiments, R is optionally substituted n-butyl. In certain embodiments, R is n-butyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-butyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-butyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted pentyl. In certain embodiments, R is optionally substituted n-pentyl. In certain embodiments, R is n-pentyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-pentyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-pentyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted hexyl. In certain embodiments, R is optionally substituted n-hexyl. In certain embodiments, R is optionally substituted n-heptyl. In certain embodiments, R is optionally substituted n-octyl. In certain embodiments, R is alkyl optionally substituted with aryl (e.g., phenyl). In certain embodiments, R is optionally substituted acyl (e.g., -C(=O)Me).

In certain embodiments, R is optionally substituted alkenyl (e.g., substituted or unsubstituted C₂₋₆ alkenyl). In certain embodiments, R is substituted or unsubstituted C₂₋₆ alkenyl. In certain embodiments, R is substituted or unsubstituted C₂₋₅ alkenyl. In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted alkynyl (e.g., substituted or unsubstituted C₂₋₆ alkynyl). In certain embodiments, R is substituted or unsubstituted C₂₋₆ alkynyl. In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted carbocyclyl. In certain embodiments, R is optionally substituted aryl (e.g., phenyl or naphthyl).

The chain length of a precursor substrate can be from C1-C40. Those substrates can have any degree and any kind of branching or saturation or chain structure, including, without limitation, aliphatic, alicyclic, and aromatic. In addition, they may include any functional groups including hydroxy, halogens, carbohydrates, phosphates, methyl-containing or nitrogen-containing functional groups.

For example, FIG. 3 shows a non-exclusive set of putative precursors for the cannabinoid pathway. Aliphatic carboxylic acids including four to eight total carbons (“C4”-“C8” in FIG. 3) and up to 10-12 total carbons with either linear or branched chains may be used as precursors for the heterologous pathway. Non-limiting examples include methanoic acid, butyric acid, pentanoic acid, hexanoic acid, heptanoic acid, isovaleric acid, octanoic acid, and decanoic acid. Additional precursors may include ethanoic acid and propanoic acid. In some embodiments, the acid, ester, and salt forms may all be used as substrates. Substrates may have any degree and any kind of branching, saturation, and chain structure, including, without limitation, aliphatic, alicyclic, and aromatic. In addition, they may include any functional modifications or combination of modifications including, without limitation, halogenation, hydroxylation, amination, acylation, alkylation, phenylation, and/or installation of pendant carbohydrates, phosphates, sulfates, heterocycles, or lipids, or any other functional groups.
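As an illustration only, the carboxylic acids named above can be tabulated by total carbon count and filtered by chain length. The following Python sketch is not part of the disclosed methods; the dictionary name PRECURSOR_ACIDS and the helper acids_in_range are hypothetical, and the carbon counts are standard chemistry rather than values taken from FIG. 3.

```python
# Illustrative lookup of the carboxylic acids named in the text.
# Each value is (total carbon count, branched?). Names are hypothetical helpers.
PRECURSOR_ACIDS = {
    "methanoic acid": (1, False),
    "ethanoic acid": (2, False),
    "propanoic acid": (3, False),
    "butyric acid": (4, False),
    "pentanoic acid": (5, False),
    "isovaleric acid": (5, True),  # 3-methylbutanoic acid, a branched C5 acid
    "hexanoic acid": (6, False),
    "heptanoic acid": (7, False),
    "octanoic acid": (8, False),
    "decanoic acid": (10, False),
}


def acids_in_range(min_carbons, max_carbons, allow_branched=True):
    """Return sorted acid names whose carbon count lies in [min_carbons, max_carbons]."""
    return sorted(
        name
        for name, (carbons, branched) in PRECURSOR_ACIDS.items()
        if min_carbons <= carbons <= max_carbons
        and (allow_branched or not branched)
    )


if __name__ == "__main__":
    # The "C4"-"C8" window mentioned in the text, linear and branched.
    print(acids_in_range(4, 8))
    # Same window, linear chains only.
    print(acids_in_range(4, 8, allow_branched=False))
```

Such a table only organizes the named examples; it says nothing about which acids a given enzyme actually accepts.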

Substrates for any of the enzymes disclosed in this application may be provided exogenously or may be produced endogenously by a host cell. In some embodiments, the cannabinoids are produced from a glucose substrate, so that compounds of Formula 1 shown in FIG. 2 and CoA precursors are synthesized by the cell. In other embodiments, a precursor is fed into the reaction. In some embodiments, a precursor is a compound selected from Formulae 1-8 in FIG. 2.

Cannabinoids produced by methods disclosed in this application include rare cannabinoids. Due to the low concentrations at which rare cannabinoids occur in nature, producing industrially significant amounts of isolated or purified rare cannabinoids from the Cannabis plant may be prohibitive because of, e.g., the large volumes of Cannabis plants and the large amounts of space, labor, time, and capital required to grow, harvest, and/or process the plant materials. The disclosure provided in this application represents a potentially efficient method for producing high yields of cannabinoids, including rare cannabinoids.

Cannabinoids produced by the disclosed methods also include non-rare cannabinoids. Without being bound by a particular theory, the methods described in this application may be advantageous compared with traditional plant-based methods for producing non-rare cannabinoids. For example, methods provided in this application represent potentially efficient means for producing consistent and high yields of non-rare cannabinoids. With traditional methods of cannabinoid production, in which cannabinoids are harvested from plants, maintaining consistent and uniform conditions, including airflow, nutrients, lighting, temperature, and humidity, can be difficult. For example, with plant-based methods, there can be microclimates created by branching, which can lead to inconsistent yields and by-product formation. In some embodiments, the methods described in this application are more efficient at producing a cannabinoid of interest as compared to harvesting cannabinoids from plants. For example, with plant-based methods, seed-to-harvest can take up to half a year, while cutting-to-harvest usually takes about 4 months. Additional steps including drying, curing, and extraction are also usually needed with plant-based methods. In contrast, in some embodiments, the fermentation-based methods described in this application only take about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 days. In some embodiments, the fermentation-based methods described in this application only take about 3-5 days. In some embodiments, the fermentation-based methods described in this application only take about 5 days. In some embodiments, the methods provided in this application reduce the amount of security needed to comply with regulatory standards. For example, a smaller area may need to be monitored and secured to practice the methods described in this application as compared to the cultivation of plants. In some embodiments, the methods described in this application are advantageous over methods that source cannabinoids from plants.

Cannabinoid Pathway Enzymes

Methods for production of cannabinoids and cannabinoid precursors can include expression of one or more of: an acyl activating enzyme (AAE); a polyketide synthase (PKS) (e.g., OLS); a polyketide cyclase (PKC); a prenyltransferase (PT); and a terminal synthase (TS).

Acyl Activating Enzyme (AAE)

A host cell described in this disclosure may comprise an AAE. As used in this disclosure, an AAE refers to an enzyme that is capable of catalyzing the esterification between a thiol and a substrate (e.g., optionally substituted aliphatic or aryl group) that has a carboxylic acid moiety. In some embodiments, an AAE is capable of using a compound of Formula (1):

or a salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, or isotopically labeled derivative thereof to produce a product of Formula (2):

R is as defined in this application. In certain embodiments, R is hydrogen. In certain embodiments, R is optionally substituted alkyl. In certain embodiments, R is optionally substituted C1-40 alkyl. In certain embodiments, R is optionally substituted C2-40 alkyl. In certain embodiments, R is optionally substituted C2-40 alkyl, which is straight chain or branched alkyl. In certain embodiments, R is optionally substituted C2-10 alkyl, optionally substituted C10-C20 alkyl, optionally substituted C20-C30 alkyl, optionally substituted C30-C40 alkyl, or optionally substituted C40-C50 alkyl, which is straight chain or branched alkyl. In certain embodiments, R is optionally substituted C3-8 alkyl. In certain embodiments, R is optionally substituted C1-C40 alkyl, C1-C20 alkyl, C1-C10 alkyl, C1-C8 alkyl, C1-C5 alkyl, C3-C5 alkyl, C3 alkyl, or C5 alkyl. In certain embodiments, R is optionally substituted C1-C20 alkyl. In certain embodiments, R is optionally substituted C1-C20 branched alkyl. In certain embodiments, R is optionally substituted C1-C20 alkyl, optionally substituted C1-C10 alkyl, optionally substituted C10-C20 alkyl, optionally substituted C20-C30 alkyl, optionally substituted C30-C40 alkyl, or optionally substituted C40-C50 alkyl. In certain embodiments, R is optionally substituted C1-C10 alkyl. In certain embodiments, R is optionally substituted C3 alkyl. In certain embodiments, R is optionally substituted n-propyl. In certain embodiments, R is unsubstituted n-propyl. In certain embodiments, R is optionally substituted C1-C8 alkyl. In some embodiments, R is a C2-C6 alkyl. In certain embodiments, R is optionally substituted C1-C5 alkyl. In certain embodiments, R is optionally substituted C3-C5 alkyl. In certain embodiments, R is optionally substituted C3 alkyl. In certain embodiments, R is optionally substituted C5 alkyl. In certain embodiments, R is of formula:

In certain embodiments, R is of formula:

In certain embodiments, R is of formula:

In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted propyl. In certain embodiments, R is optionally substituted n-propyl. In certain embodiments, R is n-propyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-propyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-propyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted butyl. In certain embodiments, R is optionally substituted n-butyl. In certain embodiments, R is n-butyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-butyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-butyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted pentyl. In certain embodiments, R is optionally substituted n-pentyl. In certain embodiments, R is n-pentyl optionally substituted with optionally substituted aryl. In certain embodiments, R is n-pentyl optionally substituted with optionally substituted phenyl. In certain embodiments, R is n-pentyl substituted with unsubstituted phenyl. In certain embodiments, R is optionally substituted hexyl. In certain embodiments, R is optionally substituted n-hexyl. In certain embodiments, R is optionally substituted n-heptyl. In certain embodiments, R is optionally substituted n-octyl. In certain embodiments, R is alkyl optionally substituted with aryl (e.g., phenyl). In certain embodiments, R is optionally substituted acyl (e.g., -C(=O)Me).

In certain embodiments, R is optionally substituted alkenyl (e.g., substituted or unsubstituted C₂₋₆ alkenyl). In certain embodiments, R is substituted or unsubstituted C₂₋₆ alkenyl. In certain embodiments, R is substituted or unsubstituted C₂₋₅ alkenyl. In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted alkynyl (e.g., substituted or unsubstituted C₂₋₆ alkynyl). In certain embodiments, R is substituted or unsubstituted C₂₋₆ alkynyl. In certain embodiments, R is of formula:

In certain embodiments, R is optionally substituted carbocyclyl. In certain embodiments, R is optionally substituted aryl (e.g., phenyl or naphthyl).

In some embodiments, a substrate for an AAE is produced by fatty acid metabolism within a host cell. In some embodiments, a substrate for an AAE is provided exogenously.

In some embodiments, an AAE is capable of catalyzing the formation of hexanoyl-coenzyme A (hexanoyl-CoA) from hexanoic acid and coenzyme A (CoA). In some embodiments, an AAE is capable of catalyzing the formation of butanoyl-coenzyme A (butanoyl-CoA) from butanoic acid and coenzyme A (CoA).

As one of ordinary skill in the art would appreciate, an AAE could be obtained from any source, including naturally occurring sources and synthetic sources (e.g., a non-naturally occurring AAE). In some embodiments, an AAE is a Cannabis enzyme. Non-limiting examples of AAEs include C. sativa hexanoyl-CoA synthetase 1 (CsHCS1) and C. sativa hexanoyl-CoA synthetase 2 (CsHCS2) as disclosed in U.S. Pat. No. 9,546,362, which is incorporated by reference in this application in its entirety.

CsHCS1 has the sequence: (SEQ ID NO: 109)MGKNYKSLDSVVASDFIALGITSEVAETLHGRLAEIVCNYGAATPQTWINIANHILSPDLPFSLHQMLFYGCYKDFGPAPPAWIPDPEKVKSTNLGALLEKRGKEFLGVKYKDPISSFSHFQEFSVRNPEVYWRTVLMDEMKISFSKDPECILRRDDINNPGGSEWLPGGYLNSAKNCLNVNSNKKLNDTMIVWRDEGNDDLPLNKLTLDQLRKRVWLVGYALEEMGLEKGCAIAIDMPMHVDAVVIYLAIVLAGYVVVSIADSFSAPEISTRLRLSKAKAIFTQDHIIRGKKRIPLYSRVVEAKSPMAIVIPCSGSNIGAELRDGDISWDYFLERAKEFKNCEFTAREQPVDAYTNILFSSGTTGEPKAIPWTQATPLKAAADGWSHLDIRKGDVIVWPTNLGWMMGPWLVYASLLNGASIALYNGSPLVSGFAKFVQDAKVTMLGVVPSIVRSWKSTNCVSGYDWSTIRCFSSSGEASNVDEYLWLMGRANYKPVIEMCGGTEIGGAFSAGSFLQAQSLSSFSSQCMGCTLYILDKNGYPMPKNKPGIGELALGPVMFGASKTLLNGNHHDVYFKGMPTLNGEVLRRHGDIFELTSNGYYHAHGRADDTMNIGGIKISSIEIERVCNEVDDRVFETTAIGVPPLGGGPEQLVIFFVLKDSNDTTIDLNQLRLSFNLGLQKKLNPLFKVTRVVPLSSLP RTATNKIMRRVLRQFSHFE.CsHCS2 has the sequence: (SEQ ID NO: 129)MEKSGYGRDGIYRSLRPPLHLPNNNNLSMVSFLFRNSSSYPQKPALIDSETNQILSFSHFKSTVIKVSHGFLNLGIKKNDVVLIYAPNSIHFPVCFLGIIASGAIATTSNPLYTVSELSKQVKDSNPKLIITVPQLLEKVKGFNLPTILIGPDSEQESSSDKVMTFNDLVNLGGSSGSEFPIVDDFKQSDTAALLYSSGTTGMSKGVVLTHKNFIASSLMVTMEQDLVGEMDNVFLCFLP1VIFHVFGLAIITYAQLQRGNTVISMARFDLEKMLKDVEKYKVTHLWVVPPVILALSKNSMVKKFNLSSIKYIGSGAAPLGKDLMEECSKVVPYGIVAQGYGMTETCGIVSMEDIRGGKRNSGSAGMLASGVEAQIVSVDTLKPLPPNQLGEIWVKGPNMMQGYFNNPQATKLTIDKKGWVHTGDLGYFDEDGHLYVVDRIKELIKYKGFQVAPAELEGLLVSHPEILDAVVIPFPDAEAGEVPVAYVVRSPNSSLTENDVKKFIAGQVASFKRLRKVTFINSVPKSASGKILRRELIQKVRSNM.

In some embodiments, an AAE comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to a sequence (e.g., nucleic acid or amino acid sequence) set forth in SEQ ID NOs:63-69, 141-142, or 707-708.
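By way of illustration, percent identity between two sequences can be computed from a pairwise alignment. The Python sketch below assumes the two sequences have already been aligned (gaps written as "-") by some external tool, and it uses one common convention: identical columns divided by all alignment columns that are not gaps in both sequences. Other conventions (e.g., dividing by the shorter sequence length) give different values, and this disclosure does not specify which convention applies; the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gaps written as '-').

    Assumed convention: matches / columns, skipping columns that are gaps
    in both sequences. A gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy strings for demonstration; not full-length enzyme sequences.
    print(percent_identity("SGAAPLG", "SGAAPLG"))
    print(round(percent_identity("SGAAPLG", "SGKAPLG"), 1))
```

In practice the alignment itself would come from a dedicated alignment program; the sketch covers only the final counting step.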

In some embodiments, an AAE acts on multiple substrates, while in other embodiments, it exhibits substrate specificity. For example, in some embodiments, an AAE exhibits substrate specificity for one or more of hexanoic acid, butyric acid, isovaleric acid, octanoic acid, or decanoic acid. In other embodiments, an AAE exhibits activity on at least two of hexanoic acid, butyric acid, isovaleric acid, octanoic acid, and decanoic acid. AAEs were identified herein that exhibited activity on butyrate and/or hexanoate (FIGS. 5 and 6). Activity on butyrate was unexpected in view of disclosure in Carvalho et al. “Designing Microorganisms for Heterologous Biosynthesis of Cannabinoids” (2017) FEMS Yeast Research June 1; 17(4).

In some embodiments, an AAE described herein comprises, at residues corresponding to the indicated positions in UniProtKB-Q6C577 (SEQ ID NO:64): N at position 90; A at position 100; G at position 105; E at position 162; Y at position 195; G at position 205; K at position 208; P at position 243; H at position 246; G at position 261; F at position 270; V at position 284; L at position 289; V at position 290; P at position 291; P at position 301; A at position 321; V at position 328; Y at position 356; G at position 381; I at position 391; P at position 400; G at position 423; Y at position 436; P at position 440; W at position 464; G at position 468; D at position 469; D at position 474; G at position 477; D at position 483; R at position 484; I at position 489; S at position 491; E at position 500; E at position 502; H at position 508; V at position 511; A at position 515; V at position 516; G at position 518; A at position 531; K at position 557; P at position 570; G at position 575; K at position 576; R at position 580; and/or L at position 582.
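A residue "corresponding to" a position in a reference sequence is typically located through a sequence alignment. The following Python sketch illustrates that lookup under stated assumptions: the reference and query have already been aligned by an external tool (gaps written as "-"), positions are 1-based in the ungapped reference, and the function names are hypothetical. The toy alignment in the example is invented for demonstration and is not SEQ ID NO:64.

```python
def residue_at_reference_position(aligned_ref, aligned_query, ref_position):
    """Return the query residue aligned to a 1-based reference position.

    Returns None when the query has a gap at that column.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_index = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char == "-":
            continue  # insertion in the query; no reference position consumed
        ref_index += 1
        if ref_index == ref_position:
            return None if query_char == "-" else query_char
    raise ValueError("ref_position is beyond the end of the reference")


def conserved_residues_present(aligned_ref, aligned_query, expected):
    """Map each reference position to True/False for the expected residue."""
    return {
        position: residue_at_reference_position(
            aligned_ref, aligned_query, position
        ) == residue
        for position, residue in expected.items()
    }


if __name__ == "__main__":
    # Toy alignment: the query has one extra residue relative to the reference.
    toy_ref = "MK-NA"
    toy_query = "MKTNG"
    print(conserved_residues_present(toy_ref, toy_query, {3: "N", 4: "A"}))
```

For an actual enzyme, the expected dictionary would hold entries such as position 90 mapped to N, taken from the list above.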

In some embodiments, an AAE described herein comprises one or more of: an amino acid sequence set forth as SGAAPLG (SEQ ID NO: 114); an amino acid sequence set forth as AYLGMSSGTSGG (SEQ ID NO: 115); an amino acid sequence set forth as DQPA (SEQ ID NO: 116); an amino acid sequence set forth as QVAPAELE (SEQ ID NO: 117); an amino acid sequence set forth as VVID (SEQ ID NO: 118); and/or an amino acid sequence set forth as SGKILRRLLR (SEQ ID NO: 119).

In some embodiments, an AAE described herein comprises: the amino acid sequence set forth as SGAAPLG (SEQ ID NO: 114) at residues corresponding to positions 319-325 in UniProtKB-Q6C577 (SEQ ID NO:64); the amino acid sequence set forth as AYLGMSSGTSGG (SEQ ID NO: 115) at residues corresponding to positions 194-205 in UniProtKB-Q6C577 (SEQ ID NO:64); the amino acid sequence set forth as DQPA (SEQ ID NO: 116) at residues corresponding to positions 398-401 in UniProtKB-Q6C577 (SEQ ID NO:64); the amino acid sequence set forth as QVAPAELE (SEQ ID NO: 117) at residues corresponding to positions 495-502 in UniProtKB-Q6C577 (SEQ ID NO:64); the amino acid sequence set forth as VVID (SEQ ID NO: 118) at residues corresponding to positions 564-567 in UniProtKB-Q6C577 (SEQ ID NO:64); and/or the amino acid sequence set forth as SGKILRRLLR (SEQ ID NO: 119) at residues corresponding to positions 574-583 in UniProtKB-Q6C577 (SEQ ID NO:64).
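As a simple illustration, the presence of the fixed motifs of SEQ ID NOs: 114-119 in a candidate sequence can be checked by exact substring search. The Python sketch below is a minimal example, not a disclosed screening method; AAE_MOTIFS and find_motifs are hypothetical names, and the positions it reports are positions in the candidate itself, which would still need to be mapped to SEQ ID NO:64 through an alignment.

```python
# Motif strings are copied from the text; the labels are used only as keys.
AAE_MOTIFS = {
    "SEQ ID NO: 114": "SGAAPLG",
    "SEQ ID NO: 115": "AYLGMSSGTSGG",
    "SEQ ID NO: 116": "DQPA",
    "SEQ ID NO: 117": "QVAPAELE",
    "SEQ ID NO: 118": "VVID",
    "SEQ ID NO: 119": "SGKILRRLLR",
}


def find_motifs(sequence):
    """Return {label: 1-based start of the first exact match, or None}."""
    hits = {}
    for label, motif in AAE_MOTIFS.items():
        index = sequence.find(motif)  # -1 when the motif is absent
        hits[label] = index + 1 if index >= 0 else None
    return hits


if __name__ == "__main__":
    # Toy sequence built for demonstration; not a real enzyme.
    toy_sequence = "MA" + "SGAAPLG" + "TT" + "VVID"
    for label, start in find_motifs(toy_sequence).items():
        print(label, start)
```

Exact matching is deliberately strict; a tolerant search would need a scoring scheme that this sketch does not attempt.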

In some embodiments, an AAE described herein comprises: an amino acid sequence with no more than three amino acid substitutions at residues corresponding to positions 428-440 in UniProtKB-Q6C577 (SEQ ID NO: 64); or an amino acid sequence with no more than one amino acid substitution at residues corresponding to positions 482-491 in UniProtKB-Q6C577 (SEQ ID NO: 64).

In some embodiments, an AAE described herein comprises: I or V at a residue corresponding to position 432 in UniProtKB-Q6C577 (SEQ ID NO: 64); S or D at a residue corresponding to position 434 in UniProtKB-Q6C577 (SEQ ID NO: 64); K or N at a residue corresponding to position 438 in UniProtKB-Q6C577 (SEQ ID NO: 64); and/or L or M at a residue corresponding to position 488 in UniProtKB-Q6C577 (SEQ ID NO: 64).

In some embodiments, an AAE described herein comprises: an amino acid sequence set forth as RGPQIMSGYHKNP (SEQ ID NO: 120); an amino acid sequence set forth as RGPQVMDGYHNNP (SEQ ID NO: 121); an amino acid sequence set forth as RGPQIMDGYHKNP (SEQ ID NO: 122); an amino acid sequence set forth as VDRTKELIKS (SEQ ID NO: 123); and/or an amino acid sequence set forth as VDRTKEMIKS (SEQ ID NO: 124).
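As a non-limiting illustration, the number of substitutions between two equal-length motif variants can be counted position by position. The sketch below assumes, based only on the motif lengths (13 and 10 residues) and on the variable positions recited above, that SEQ ID NOs: 120-122 occupy the window corresponding to positions 428-440 and that SEQ ID NOs: 123-124 occupy the window corresponding to positions 482-491; the function names are hypothetical.

```python
from typing import List


def count_substitutions(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def substituted_positions(a: str, b: str, first_position: int) -> List[int]:
    """Positions (numbered from first_position) at which a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [first_position + i for i, (x, y) in enumerate(zip(a, b)) if x != y]


SEQ_120 = "RGPQIMSGYHKNP"
SEQ_121 = "RGPQVMDGYHNNP"
SEQ_122 = "RGPQIMDGYHKNP"
SEQ_123 = "VDRTKELIKS"
SEQ_124 = "VDRTKEMIKS"

# Assumed window starts: 428 for the 13-mers, 482 for the 10-mers.
print(substituted_positions(SEQ_120, SEQ_121, 428))  # [432, 434, 438]
print(substituted_positions(SEQ_123, SEQ_124, 482))  # [488]
```

Under these assumptions, the variants differ from one another by at most three substitutions in the first window and by one substitution in the second window, at the positions for which alternative residues are recited.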

A recombinant host cell that expresses a heterologous gene encoding an AAE described herein may be capable of producing at least 1% (e.g., at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 125%, at least 150%, at least 175%, at least 200%, at least 300%, at least 400%, at least 500%, at least 600%, at least 700%, at least 800%, at least 900%, or at least 1,000%) more hexanoyl-CoA and/or more butanoyl-coenzyme A relative to a control. In some embodiments, a control is a host cell that does not express a heterologous gene encoding an AAE.

Polyketide Synthases (PKS)

A host cell described in this application may comprise a PKS. As used in this application, a “PKS” refers to an enzyme that is capable of producing a polyketide. In certain embodiments, a PKS converts a compound of Formula (2) to a compound of Formula (4), (5), and/or (6). In certain embodiments, a PKS converts a compound of Formula (2) to a compound of Formula (4). In certain embodiments, a PKS converts a compound of Formula (2) to a compound of Formula (5). In certain embodiments, a PKS converts a compound of Formula (2) to a compound of Formula (4) and/or (5). In certain embodiments, a PKS converts a compound of Formula (2) to a compound of Formula (5) and/or (6).

In some embodiments, a PKS is a tetraketide synthase (TKS). In certain embodiments, a PKS is an olivetol synthase (OLS). As used in this application, an “OLS” refers to an enzyme that is capable of using a substrate of Formula (2a) to form a compound of Formula (4a), (5a), and/or (6a) as shown in FIG. 1. In some embodiments, an OLS catalyzes the formation of olivetol (Formula (5a)). In some embodiments, an olivetol synthase (OLS) catalyzes the formation of olivetol with minimal production of 3,5,7-trioxoalkanoyl-CoA and/or olivetolic acid. In some instances, an OLS that is capable of catalyzing the formation of olivetol may be useful in providing olivetol as a substrate for a prenyltransferase. As a non-limiting example, NphB can use olivetol as a reactant. See, e.g., Kumano et al., Bioorg Med Chem. 2008 Sep. 1; 16(17): 8117-8126.

In certain embodiments, a PKS is a divarinic acid synthase (DVS).

A non-limiting example of an OLS is provided by UniProtKB-B1Q2B6 from C. sativa. In C. sativa, this OLS uses hexanoyl-CoA and malonyl-CoA as substrates to form 3,5,7-trioxododecanoyl-CoA. OLS (e.g., UniProtKB-B1Q2B6) in combination with olivetolic acid cyclase (OAC) produces olivetolic acid (OA) in C. sativa.

The amino acid sequence of UniProtKB-B1Q2B6 is:

(SEQ ID NO: 5) MNHLRAEGPASVLAIGTANPENILLQDEFPDYYFRVTKSEHMTQLKEKFRKICDKSMIRKRNCFLNEEHLKQNPRLVEHEMQTLDARQDMLVVEVPKLGKDACAKAIKEWGQPKSKITHLIFTSASTTDMPGADYHCAKLLGLSPSVKRVMMYQLGCYGGGTVLRIAKDIAENNKGARVLAVCCDIMACLFRGPSESDLELLVGQAIFGDGAAAVIVGAEPDESVGERPIFELVSTGQTILPNSEGTIGGHIREAGLIFDLHKDVPMLISNNIEKCLIEAFTPIGISDWNSIFWITHPGGKAILDKVEEKLHLKSDKFVDSRHVLSEHGNMSSSTVLFVMDELRKRSLEEGKSTTGDGFEWGVLFGFGPGLTVERVVVRSVPIKY.

Structurally, an OLS comprises a triad of conserved residues, which have been implicated as catalytic residues. This triad of conserved residues may be referred to as a catalytic triad. See, e.g., Taura et al., FEBS Letters 583 (2009) 2061-2066. The catalytic triad of UniProtKB-B1Q2B6 (SEQ ID NO: 5) comprises C157, H297, and N330. One of ordinary skill in the art would be able to identify corresponding catalytic residues in other PKSs, including OLSs, by aligning the amino acid sequence of interest with UniProtKB-B1Q2B6. A PKS, including an OLS, may comprise the amino acid C at a residue corresponding to position 157 in SEQ ID NO: 5, the amino acid H at a residue corresponding to position 297 in SEQ ID NO: 5, and the amino acid N at a residue corresponding to position 330 in SEQ ID NO: 5. As a non-limiting example, the residues corresponding to positions 157, 297, and 330 in SEQ ID NO: 5 are C164, H304, and N337, respectively, in SEQ ID NO: 6. Similarly, the residues corresponding to positions 157, 297, and 330 in SEQ ID NO: 5 are C164, H304, and N337, respectively, in SEQ ID NO: 7.
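As a non-limiting illustration of identifying corresponding residues by alignment, the sketch below walks two rows of an existing pairwise alignment (with "-" denoting a gap) and maps 1-based positions in a reference sequence to 1-based positions in a sequence of interest. The function name `map_positions` and the toy alignment are hypothetical; producing the alignment itself (e.g., with a standard alignment tool) is assumed to have been done separately.

```python
from typing import Dict, Iterable, Optional


def map_positions(
    ref_aln: str, query_aln: str, ref_positions: Iterable[int]
) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based query positions.

    ref_aln and query_aln are the two gapped rows of one pairwise alignment.
    A reference residue aligned to a gap in the query maps to None.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned rows must have the same length")
    wanted = set(ref_positions)
    mapping: Dict[int, Optional[int]] = {}
    ref_i = query_i = 0
    for r, q in zip(ref_aln, query_aln):
        if r != "-":
            ref_i += 1
        if q != "-":
            query_i += 1
        if r != "-" and ref_i in wanted:
            mapping[ref_i] = query_i if q != "-" else None
    return mapping


# Hypothetical toy alignment: the query has a 2-residue N-terminal extension
# and a 1-residue insertion relative to the reference.
toy_ref = "--MKC-HN"
toy_query = "AAMKCGHN"
toy_map = map_positions(toy_ref, toy_query, [3, 4, 5])
print(toy_map)
```

In the toy alignment, reference positions 3, 4, and 5 correspond to query positions 5, 7, and 8. The same bookkeeping underlies statements such as positions 157, 297, and 330 in SEQ ID NO: 5 corresponding to positions 164, 304, and 337 in another sequence.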

The active site of a PKS may be defined by generating the three-dimensional structure of the PKS and identifying the residues within a particular distance of any of the residues within the catalytic triad and/or within a particular distance of a docked substrate within the PKS (e.g., a compound of Formula (2)). A substrate docks or binds in the substrate binding pocket of a PKS. The substrate binding pocket may comprise the active site of the PKS. As a non-limiting example, the structure of a PKS may be generated using ROSETTA software. See, e.g., Kaufmann et al., Biochemistry 2010, 49, 2987-2998.

As used herein, a residue is within the active site of an OLS enzyme if it is within about 12 angstroms of any of the residues within the catalytic triad of the OLS enzyme and/or within about 12 angstroms of a docked substrate within the OLS enzyme.

In some embodiments, a residue is within 12 angstroms (Å), within 11 Å, within 10 Å, within 9 Å, within 8 Å, within 7 Å, within 6 Å, within 5 Å, within 4 Å, within 3 Å, within 2 Å, or within 1 Å from any of the residues within the catalytic triad (i.e., positions 157, 297, and 330 in SEQ ID NO: 5) and/or from a docked substrate (e.g., hexanoyl-CoA).

In some embodiments, a residue in a PKS is within 20 Å, within 19 Å, within 18 Å, within 17 Å, within 16 Å, within 15 Å, within 14 Å, within 13 Å, within 12 Å, within 11 Å, within 10 Å, within 9 Å, within 8 Å, within 7 Å, within 6 Å, within 5 Å, within 4 Å, within 3 Å, within 2 Å, and/or within 1 Å from any of the residues within the catalytic triad (i.e., residues in the PKS corresponding to positions 157, 297, and 330 in SEQ ID NO: 5) and/or a docked substrate.
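As a non-limiting illustration of the distance criterion, the sketch below selects residues having any atom within a cutoff of any atom of the catalytic triad residues and/or of a docked substrate. The text does not specify which atoms are used to measure distance or whether the boundary is inclusive; the any-atom, inclusive (<=) convention used here is an assumption, and the function name `residues_within` and the toy coordinates are hypothetical.

```python
import math
from typing import Dict, Iterable, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def residues_within(
    atoms_by_position: Dict[int, Sequence[Coord]],
    anchor_positions: Iterable[int],
    cutoff: float,
    substrate_atoms: Sequence[Coord] = (),
) -> List[int]:
    """Positions with any atom within `cutoff` angstroms of any anchor atom.

    Anchor atoms are all atoms of the anchor residues (e.g., the catalytic
    triad) plus any docked-substrate atoms. Anchor residues themselves are
    included in the result because their distance to themselves is zero.
    """
    anchor_atoms = [a for p in anchor_positions for a in atoms_by_position[p]]
    anchor_atoms.extend(substrate_atoms)
    hits = []
    for position, atoms in atoms_by_position.items():
        if any(math.dist(a, b) <= cutoff for a in atoms for b in anchor_atoms):
            hits.append(position)
    return sorted(hits)


# Hypothetical toy coordinates (one atom per residue), in angstroms.
toy_structure = {
    157: [(0.0, 0.0, 0.0)],
    297: [(3.0, 0.0, 0.0)],
    330: [(0.0, 4.0, 0.0)],
    10: [(0.0, 0.0, 5.0)],
    50: [(0.0, 15.0, 0.0)],
    200: [(30.0, 0.0, 0.0)],
}
TRIAD = (157, 297, 330)
within_12 = residues_within(toy_structure, TRIAD, cutoff=12.0)
within_8 = residues_within(toy_structure, TRIAD, cutoff=8.0)
print(within_12, within_8)
```

In practice the coordinates would come from a modeled or experimentally determined structure, and the substrate atoms from a docking result.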

As a non-limiting example, positions 17, 23, 25, 51, 54, 64, 95, 123, 125, 153, 196, 201, 207, 241, 247, 267, 273, 277, 296, 307, 320, 324, 326, 328, 334, 335, and 375 in SEQ ID NO: 5 may be located within the active site of a PKS comprising SEQ ID NO: 5. Positions 51, 54, 123, 125, 201, 207, 241, 247, 267, 273, 296, 307, 324, 326, 328, 334, 335, and 375 in SEQ ID NO: 5 may be located within about 8 Å from any of the residues within the catalytic triad and/or a docked substrate of the PKS comprising SEQ ID NO: 5.

In some embodiments, a PKS comprises an amino acid substitution, insertion, or deletion at a residue that is within the active site of the PKS. In some embodiments, a PKS comprises an amino acid substitution, insertion, or deletion at a residue that is within 12 angstroms (Å), within 11 Å, within 10 Å, within 9 Å, within 8 Å, within 7 Å, within 6 Å, within 5 Å, within 4 Å, within 3 Å, within 2 Å, or within 1 Å of any one of the catalytic triad residues (i.e., positions 157, 297, and 330 in SEQ ID NO: 5) and/or of a docked substrate. In some embodiments, the amino acid substitution, insertion, or deletion is at a residue corresponding to position 17, 23, 25, 51, 54, 64, 95, 123, 125, 153, 196, 201, 207, 241, 247, 267, 273, 277, 296, 307, 320, 324, 326, 328, 334, 335, and/or 375 in SEQ ID NO: 5. In some embodiments, a residue in a PKS corresponding to position 17, 23, 25, 51, 54, 64, 95, 123, 125, 153, 196, 201, 207, 241, 247, 267, 273, 277, 296, 307, 320, 324, 326, 328, 334, 335, and/or 375 in SEQ ID NO: 5 is located within 12 Å from the active site of the PKS. In some embodiments, a residue in a PKS corresponding to position 51, 54, 123, 125, 201, 207, 241, 247, 267, 273, 296, 307, 324, 326, 328, 334, 335, and/or 375 in SEQ ID NO: 5 is located within 8 Å from the active site of the PKS. In some embodiments, the PKS comprises one or more of: T17K, I23C, L25R, K51R, D54R, F64Y, V95A, T123C, A125S, Y153G, E196K, L201C, I207L, L241I, T247A, M267K, M267G, I273V, L277M, T296A, V307I, D320A, V324I, S326R, H328Y, S334P, S334A, T335C, and/or R375T relative to SEQ ID NO: 5. In some embodiments, a host cell comprising one or more of these amino acid substitutions relative to SEQ ID NO: 5 is capable of producing at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more than 15 mg/L olivetol, including all values in between.
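As a non-limiting illustration of the substitution notation used above (original residue, 1-based position, replacement residue; e.g., T17K), the sketch below applies a list of substitutions to a sequence after checking that the stated original residue is actually present at each position. The function name `apply_substitutions` is hypothetical, and only the first 30 residues of SEQ ID NO: 5 are used to keep the example short.

```python
import re
from typing import Iterable

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: Iterable[str]) -> str:
    """Apply substitutions such as 'T17K' (1-based) to an amino acid sequence.

    Each substitution is validated against the unmodified input sequence.
    Two substitutions at the same position (e.g., M267K and M267G) are
    alternatives and should not be passed together; the later one would win.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        original, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        if not 1 <= position <= len(sequence):
            raise ValueError(f"position {position} is outside the sequence")
        if sequence[position - 1] != original:
            raise ValueError(
                f"{sub}: expected {original} at position {position}, "
                f"found {sequence[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# First 30 residues of SEQ ID NO: 5 as given in the text.
SEQ5_N_TERMINUS = "MNHLRAEGPASVLAIGTANPENILLQDEFP"
variant = apply_substitutions(SEQ5_N_TERMINUS, ["T17K", "I23C", "L25R"])
print(variant)
```

The check against the original residue guards against applying a substitution to a sequence whose numbering differs from that of SEQ ID NO: 5.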

In some embodiments, a PKS comprises: an amino acid substitution, insertion, or deletion at a residue that is more than 12 angstroms (Å), more than 11 Å, more than 10 Å, more than 9 Å, more than 8 Å, more than 7 Å, more than 6 Å, more than 5 Å, more than 4 Å, more than 3 Å, more than 2 Å, or more than 1 Å away from the catalytic triad (i.e., positions 157, 297, and 330 in SEQ ID NO: 5) and/or from a docked substrate. In some embodiments, the residue corresponds to position 71, 92, 100, 108, 116, 128, 135, 229, 278, 284, and/or 348 in SEQ ID NO: 5. In some embodiments, the residue in a PKS corresponding to position 71, 92, 100, 108, 116, 128, 135, 229, 278, 284, and/or 348 in SEQ ID NO: 5 is more than 12 Å from the active site of the PKS. In some embodiments, the PKS comprises one or more of: I284Y, K100L, K116R, I278E, K108D, L348S, K71R, V92G, T128V, K100M, Y135V, P229A, T128A, and/or T128I. In some embodiments, a host cell comprising one or more of these mutations is capable of producing at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 or more than 15 mg/L olivetol, including all values in between.

In some embodiments, a PKS comprises the amino acid C at a residue corresponding to position 335 of SEQ ID NO: 5. In some embodiments, a PKS comprises the amino acid substitution T335C relative to a control. In some embodiments, the control is a PKS comprising SEQ ID NO: 5. In some embodiments, a PKS comprises a sequence at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to a sequence (e.g., amino acid or nucleic acid sequence) set forth in SEQ ID NOs: 38, 172, 175, 176, 196, 204, 205, 7, 17, 145, 13, 8, and 15. In some embodiments, a PKS comprises a sequence at most 5%, at most 10%, at most 15%, at most 20%, at most 25%, at most 30%, at most 35%, at most 40%, at most 45%, at most 50%, at most 55%, at most 60%, at most 65%, at most 70%, at most 71%, at most 72%, at most 73%, at most 74%, at most 75%, at most 76%, at most 77%, at most 78%, at most 79%, at most 80%, at most 81%, at most 82%, at most 83%, at most 84%, at most 85%, at most 86%, at most 87%, at most 88%, at most 89%, at most 90%, at most 91%, at most 92%, at most 93%, at most 94%, at most 95%, at most 96%, at most 97%, at most 98%, at most 99%, or is 100% identical, including all values in between, to a sequence (e.g., amino acid or nucleic acid sequence) set forth in SEQ ID NOs: 38, 172, 175, 176, 196, 204, 205, 7, 17, 145, 13, 8, and 15.

In some embodiments, a PKS described herein comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to a sequence (e.g., nucleic acid or amino acid sequence) set forth in UniProtKB-A0A088G5Z5 (SEQ ID NO: 7), SEQ ID NO: 714, SEQ ID NO: 715, or SEQ ID NO: 38.
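As a non-limiting illustration, percent identity between two sequences can be computed from the two gapped rows of a pairwise alignment. Several conventions exist for the denominator; the sketch below assumes one of them (identical columns divided by all alignment columns, excluding columns that are gaps in both rows), which is not necessarily the convention intended by the disclosure. The function name `percent_identity` and the toy alignment are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Columns where both rows are gaps ('-') are ignored; a residue aligned
    to a gap counts as a compared, non-identical column.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned rows must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / compared


# Hypothetical toy alignment: one substitution and one gap over six columns.
toy_identity = percent_identity("MKC-HN", "MRCGHN")
print(round(toy_identity, 1))
```

A threshold such as "at least 90% identical" would then be a simple comparison of the returned value against 90.0 under the chosen convention.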

In some embodiments, relative to the sequence of SEQ ID NO: 7, the PKS comprises an amino acid substitution at a residue corresponding to position 28, 34, 50, 70, 71, 76, 88, 100, 151, 203, 219, 285, 359, and/or 385 in SEQ ID NO: 7. In some embodiments, the PKS comprises: the amino acid P at a residue corresponding to position 28 in SEQ ID NO: 7; the amino acid Q at a residue corresponding to position 34 in SEQ ID NO: 7; the amino acid N at a residue corresponding to position 50 in SEQ ID NO: 7; the amino acid M at a residue corresponding to position 70 in SEQ ID NO: 7; the amino acid Y at a residue corresponding to position 71 in SEQ ID NO: 7; the amino acid I at a residue corresponding to position 76 in SEQ ID NO: 7; the amino acid A at a residue corresponding to position 88 in SEQ ID NO: 7; the amino acid P or T at a residue corresponding to position 100 in SEQ ID NO: 7; the amino acid P at a residue corresponding to position 151 in SEQ ID NO: 7; the amino acid K at a residue corresponding to position 203 in SEQ ID NO: 7; the amino acid C at a residue corresponding to position 219 in SEQ ID NO: 7; the amino acid A at a residue corresponding to position 285 in SEQ ID NO: 7; the amino acid M at a residue corresponding to position 359 in SEQ ID NO: 7; and/or the amino acid M at a residue corresponding to position 385 in SEQ ID NO: 7. In some embodiments, the PKS comprises one or more of the following amino acid substitutions relative to SEQ ID NO: 7: E28P, S34Q, V50N, F70M, V71Y, L76I, D88A, R100P, R100T, N151P, E203K, A219C, E285A, K359M, and/or L385M. In some embodiments, the PKS comprises V71Y and/or F70M. In some embodiments, the PKS comprises C at a residue corresponding to position 164 in SEQ ID NO: 7; H at a residue corresponding to position 304 in SEQ ID NO: 7; and/or N at a residue corresponding to position 337 in SEQ ID NO: 7.

In some embodiments, a host cell with a PKS that comprises an amino acid substitution at a residue corresponding to position 28, 34, 50, 70, 71, 76, 88, 100, 151, 203, 219, 285, 359, and/or 385 in SEQ ID NO: 7 produces at least 1% (e.g., at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 125%, at least 150%, at least 175%, at least 200%, at least 300%, at least 400%, at least 500%, at least 600%, at least 700%, at least 800%, at least 900%, or at least 1,000%) more of a product (e.g., a compound of Formula (4), (5), and/or (6)) relative to a host cell comprising SEQ ID NO: 7.

In some embodiments, a PKS described herein comprises: A at a residue corresponding to position 17 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); A at a residue corresponding to position 21 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I at a residue corresponding to position 22 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 23 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); Q at a residue corresponding to position 33 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D at a residue corresponding to position 38 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F at a residue corresponding to position 41 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L at a residue corresponding to position 52 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); K at a residue corresponding to position 55 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); C at a residue corresponding to position 60 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R at a residue corresponding to position 68 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R at a residue corresponding to position 94 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); A at a residue corresponding to position 109 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); A at a residue corresponding to position 113 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); W at a residue corresponding to position 117 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 118 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); S at a residue corresponding to position 122 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I at a residue corresponding to position 124 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); T at a residue corresponding to position 125 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); H at a residue corresponding to position 126 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); P at a residue corresponding to position 138 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D at a residue corresponding to position 141 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L at a residue corresponding to position 150 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 163 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); C at a residue corresponding to position 164 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 168 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R at a residue corresponding to position 172 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); K at a residue corresponding to position 175 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); E at a residue corresponding to position 179 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R at a residue corresponding to position 185 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L at a residue corresponding to position 187 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); V at a residue corresponding to position 189 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); C at a residue corresponding to position 190 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); P at a residue corresponding to position 201 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F at a residue corresponding to position 215 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D at a residue corresponding to position 217 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 218 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 225 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); E at a residue corresponding to position 234 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 263 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); P at a residue corresponding to position 273 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F at a residue corresponding to position 288 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D at a residue corresponding to position 295 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); N at a residue corresponding to position 297 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F at a residue corresponding to position 300 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); H at a residue corresponding to position 304 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 306 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 307 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L at a residue corresponding to position 311 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 336 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); N at a residue corresponding to position 337 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); M at a residue corresponding to position 338 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); V at a residue corresponding to position 343 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D at a residue corresponding to position 348 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R at a residue corresponding to position 351 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 363 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 365 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 369 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 375 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); P at a residue corresponding to position 376 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G at a residue corresponding to position 377 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); E at a residue corresponding to position 381 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); and/or S at a residue corresponding to position 387 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6).

In some embodiments, a PKS described herein comprises: S, T, or G at a residue corresponding to position 18 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); V or I at a residue corresponding to position 19 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); E, P, S, A, or D at a residue corresponding to position 28 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I, C, S, F, Y, Q, H, A, or V at a residue corresponding to position 30 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D, S, C, I, or A at a residue corresponding to position 34 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F or Y at a residue corresponding to position 36 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); Y, F, or V at a residue corresponding to position 39 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); K, N, D, or S at a residue corresponding to position 45 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); K, R, or H at a residue corresponding to position 58 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F, H, V, Y, or N at a residue corresponding to position 71 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); R, N, C, E, S, or H at a residue corresponding to position 82 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); M, A, D, S, E, or V at a residue corresponding to position 88 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); Q, P, S, N, L, or K at a residue corresponding to position 89 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); T or S at a residue corresponding to position 90 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); M, I, F, L, or V at a residue corresponding to position 97 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D, E, K, or A at a residue corresponding to position 108 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); C, A, or S at a residue corresponding to position 110 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); T, C, or Y at a residue corresponding to position 130 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); S or T at a residue corresponding to position 131 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); A, S, T, or I at a residue corresponding to position 132 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L, Q, or H at a residue corresponding to position 162 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); G, A, or S at a residue corresponding to position 166 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I, L, V, T, M, or Y at a residue corresponding to position 173 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I, L, F, or V at a residue corresponding to position 177 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); C, S, or A at a residue corresponding to position 191 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); D or E at a residue corresponding to position 192 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); M or T at a residue corresponding to position 194 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L, C, T, S, M, or N at a residue corresponding to position 197 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); E or D at a residue corresponding to position 207 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); V, L, M, or I at a residue corresponding to position 222 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I, L, C, S, V, or M at a residue corresponding to position 237 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); T, A, N, or S at a residue corresponding to position 243 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); N, D, E, or G at a residue corresponding to position 250 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); I or L at a residue corresponding to position 299 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); K, P, or R at a residue corresponding to position 308 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F, L, or M at a residue corresponding to position 325 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); H or Y at a residue corresponding to position 335 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); M or L at a residue corresponding to position 347 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); L, M, I, or T at a residue corresponding to position 350 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); F, L, or M at a residue corresponding to position 366 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6); and/or R or T at a residue corresponding to position 382 in UniProtKB-A0A1R3HSU5 (SEQ ID NO: 6).

In some embodiments, a PKS comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to a sequence (e.g., nucleic acid or amino acid sequence) set forth in SEQ ID NOs: 1-31, 77-92, 143-171, 207-249, 293-420, 549-627, 32-62, 93-108, 172-206, 250-292, 421-548, 628-705, and 706 or to a sequence selected from Tables 5-6 and 13-16.

In some embodiments, a PKS comprises at least N amino acid substitutions, deletions, or insertions relative to SEQ ID NOs: 1-31, 77-92, 143-171, 207-249, 293-420, and 549-627 or to an amino acid sequence selected from Tables 5-6 and 13-16, where N is any integer from 1 to 380 (e.g., at least 1, at least 2, at least 3, at least 4, at least 5, at least 10, at least 20, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, or at least 380).

In some embodiments, a PKS comprises at most 1, at most 2, at most 3, at most 4, at most 5, at most 6, at most 7, at most 8, at most 9, at most 10, at most 11, at most 12, at most 13, at most 14, at most 15, at most 16, at most 17, at most 18, at most 19, at most 20, at most 21, at most 22, at most 23, at most 24, at most 25, at most 26, at most 27, at most 28, at most 29, at most 30, at most 31, at most 32, at most 33, at most 34, at most 35, at most 36, at most 37, at most 38, at most 39, at most 40, at most 41, at most 42, at most 43, at most 44, at most 45, at most 46, at most 47, at most 48, at most 49, at most 50, at most 51, at most 52, at most 53, at most 54, at most 55, at most 56, at most 57, at most 58, at most 59, at most 60, at most 61, at most 62, at most 63, at most 64, at most 65, at most 66, at most 67, at most 68, at most 69, at most 70, at most 71, at most 72, at most 73, at most 74, at most 75, at most 76, at most 77, at most 78, at most 79, at most 80, at most 81, at most 82, at most 83, at most 84, at most 85, at most 86, at most 87, at most 88, at most 89, at most 90, at most 91, at most 92, at most 93, at most 94, at most 95, at most 96, at most 97, at most 98, at most 99, at most 100, at most 101, at most 102, at most 103, at most 104, at most 105, at most 106, at most 107, at most 108, at most 109, at most 110, at most 111, at most 112, at most 113, at most 114, at most 115, at most 116, at most 117, at most 118, at most 119, at most 120, at most 121, at most 122, at most 123, at most 124, at most 125, at most 126, at most 127, at most 128, at most 129, at most 130, at most 131, at most 132, at most 133, at most 134, at most 135, at most 136, at most 137, at most 138, at most 139, at most 140, at most 141, at most 142, at most 143, at most 144, at most 145, at most 146, at most 147, at most 148, at most 149, at most 150, at most 151, at most 152, at most 153, at most 154, at most 155, at most 156, at most 157, at most 158, at most 159, at most 160, at most 161, at most 162, at most 163, at most 164, at most 165, at most 166, at most 167, at most 168, at most 169, at most 170, at most 171, at most 172, at most 173, at most 174, at most 175, at most 176, at most 177, at most 178, at most 179, at most 180, at most 181, at most 182, at most 183, at most 184, at most 185, at most 186, at most 187, at most 188, at most 189, at most 190, at most 191, at most 192, at most 193, at most 194, at most 195, at most 196, at most 197, at most 198, at most 199, at most 200, at most 201, at most 202, at most 203, at most 204, at most 205, at most 206, at most 207, at most 208, at most 209, at most 210, at most 211, at most 212, at most 213, at most 214, at most 215, at most 216, at most 217, at most 218, at most 219, at most 220, at most 221, at most 222, at most 223, at most 224, at most 225, at most 226, at most 227, at most 228, at most 229, at most 230, at most 231, at most 232, at most 233, at most 234, at most 235, at most 236, at most 237, at most 238, at most 239, at most 240, at most 241, at most 242, at most 243, at most 244, at most 245, at most 246, at most 247, at most 248, at most 249, at most 250, at most 251, at most 252, at most 253, at most 254, at most 255, at most 256, at most 257, at most 258, at most 259, at most 260, at most 261, at most 262, at most 263, at most 264, at most 265, at most 266, at most 267, at most 268, at most 269, at most 270, at most 271, at most 272, at most 273, at most 274, at most 275, at most 276, at most 277, at most 278, at most 279, at most 280, at most 281, at most 282, at most 283, at most 284, at most 285, at most 286, at most 287, at most 288, at most 289, at most 290, at most 291, at most 292, at most 293, at most 294, at most 295, at most 296, at most 297, at most 298, at most 299, at most 300, at most 301, at most 302, at most 303, at most 304, at most 305, at most 306, at most 307, at most 308, at most 309, at most 310, at most 311, at most 312, at most 313, at most 314, at most 315, at most 316, at most 317, at most 318, at most 319, at most 320, at most 321, at most 322, at most 323, at most 324, at most 325, at most 326, at most 327, at most 328, at most 329, at most 330, at most 331, at most 332, at most 333, at most 334, at most 335, at most 336, at most 337, at most 338, at most 339, at most 340, at most 341, at most 342, at most 343, at most 344, at most 345, at most 346, at most 347, at most 348, at most 349, at most 350, at most 351, at most 352, at most 353, at most 354, at most 355, at most 356, at most 357, at most 358, at most 359, at most 360, at most 361, at most 362, at most 363, at most 364, at most 365, at most 366, at most 367, at most 368, at most 369, at most 370, at most 371, at most 372, at most 373, at most 374, at most 375, at most 376, at most 377, at most 378, at most 379, or at most 380 amino acid substitutions, deletions, or insertions relative to SEQ ID NOs: 1-31, 77-92, 143-171, 207-249, 293-420, and 549-627 or to an amino acid sequence selected from Tables 5-6 and 13-16.
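As a non-limiting illustration, the number of amino acid substitutions, deletions, or insertions separating a variant from a reference sequence may be estimated as the minimum edit (Levenshtein) distance between the two sequences. The sketch below is one possible way of counting and is not a method prescribed by this disclosure; the function name and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def edit_count(reference: str, variant: str) -> int:
    """Minimum number of single-residue substitutions, deletions, or
    insertions needed to convert `reference` into `variant`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j residues of the variant.
    previous = list(range(len(variant) + 1))
    for i, ref_res in enumerate(reference, start=1):
        current = [i]
        for j, var_res in enumerate(variant, start=1):
            cost = 0 if ref_res == var_res else 1
            current.append(min(
                previous[j] + 1,         # delete ref_res
                current[j - 1] + 1,      # insert var_res
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical toy sequences: one substitution (S3W) and one deletion (L6).
reference = "MKSAVLT"
variant = "MKWAVT"
print(edit_count(reference, variant))
```

A variant could then be compared against a chosen bound (for example, "at most 10") by testing `edit_count(reference, variant) <= 10`.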

As one of ordinary skill in the art would appreciate, a PKS, such as an OLS, could be obtained from any source, including naturally occurring sources and synthetic sources (e.g., a non-naturally occurring PKS). In some embodiments, a PKS is from Cannabis. In some embodiments, a PKS is from Dictyostelium. Non-limiting examples of PKS enzymes may be found in U.S. Pat. No. 6,265,633; WO2019/202510; WO 2018/148848 A1; WO 2018/148849 A1; and US 2018/155748 (granted as U.S. Pat. No. 10,435,727), which are incorporated by reference in this application in their entireties. For example, PKSs include SEQ ID NO: 2 from WO2019/202510, SEQ ID NO: 9 from WO2019/202510, SEQ ID NO: 37 from WO 2018/148848 A1, SEQ ID NO: 38 from WO 2018/148848 A1, SEQ ID NO: 9 from WO 2018/148849 A1, SEQ ID NO: 10 from WO 2018/148849 A1, SEQ ID NO: 13 from WO 2018/148849 A1, and SEQ ID NO: 35 from U.S. Pat. No. 10,435,727.

In certain embodiments, polyketide synthases can use hexanoyl-CoA or any acyl-CoA (or a product of Formula (2)):

and three malonyl-CoAs as substrates to form 3,5,7-trioxododecanoyl-CoA or other 3,5,7-trioxo-acyl-CoA derivatives; or to form a compound of Formula (4):

wherein R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; depending on substrate. R is as defined in this application. In some embodiments, R is a C2-C6 optionally substituted alkyl. In some embodiments, R is a propyl or pentyl. In some embodiments, R is pentyl. In some embodiments, R is propyl. A PKS may also bind isovaleryl-CoA, octanoyl-CoA, hexanoyl-CoA, and butyryl-CoA. In some embodiments, a PKS is capable of catalyzing the formation of a 3,5,7-trioxoalkanoyl-CoA (e.g., 3,5,7-trioxododecanoyl-CoA). In some embodiments, an OLS is capable of catalyzing the formation of a 3,5,7-trioxoalkanoyl-CoA (e.g., 3,5,7-trioxododecanoyl-CoA).
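As a non-limiting illustration of the substrate dependence described above, each malonyl-CoA extender contributes two carbons to the growing chain after decarboxylation, so three extensions of a six-carbon starter (hexanoyl-CoA) give a twelve-carbon 3,5,7-trioxododecanoyl chain, and three extensions of a four-carbon starter (butyryl-CoA) give a ten-carbon chain. The sketch below only does this carbon bookkeeping for straight-chain starters; the function and dictionary names are hypothetical.

```python
def polyketide_chain_carbons(starter_acyl_carbons: int,
                             malonyl_extensions: int = 3) -> int:
    """Carbon count of the linear acyl chain after the given number of
    decarboxylative condensations with malonyl-CoA (two carbons each)."""
    if starter_acyl_carbons < 1 or malonyl_extensions < 0:
        raise ValueError("carbon counts must be positive")
    return starter_acyl_carbons + 2 * malonyl_extensions


# Straight-chain starter units named in the text, with their acyl carbon counts.
STARTERS = {"butyryl-CoA": 4, "hexanoyl-CoA": 6, "octanoyl-CoA": 8}

for name, carbons in STARTERS.items():
    total = polyketide_chain_carbons(carbons)
    print(f"{name}: {total}-carbon 3,5,7-trioxoacyl chain")
```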

In some embodiments, a PKS uses a substrate of Formula (2) to form a compound of Formula (4):

wherein R is unsubstituted pentyl.

A recombinant host cell that expresses a heterologous gene encoding a PKS described herein may be capable of producing at least 1% (e.g., at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 100%, at least 125%, at least 150%, at least 175%, at least 200%, at least 300%, at least 400%, at least 500%, at least 600%, at least 700%, at least 800%, at least 900%, or at least 1,000%) more of a product (e.g., a compound of Formula (4), (5), and/or (6)) relative to a control. In some embodiments, a compound of Formula (4) is a compound of Formula (4a), a compound of Formula (5) is a compound of Formula (5a), and a compound of Formula (6) is a compound of Formula (6a). In some embodiments, a control is a recombinant host cell that expresses a heterologous gene encoding UniProtKB-B1Q2B6. In some embodiments, a control is a recombinant host cell that expresses a heterologous gene encoding a wild-type PKS.
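As a non-limiting illustration, "X% more of a product relative to a control" may be read as the relative increase in measured product over the control measurement. The sketch below assumes that reading; the function name and the example titers are hypothetical and are not data from this disclosure.

```python
def percent_increase(titer: float, control_titer: float) -> float:
    """Percent more product than the control: 100 * (titer - control) / control."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return 100.0 * (titer - control_titer) / control_titer


# Hypothetical titers in mg/L, for illustration only.
print(percent_increase(15.0, 10.0))   # strain makes 1.5x the control
print(percent_increase(110.0, 10.0))  # strain makes 11x the control
```

Under this reading, a strain producing 11 times the control titer corresponds to the "at least 1,000%" tier.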

A recombinant host cell that expresses a heterologous gene encoding a PKS described herein may be capable of producing at least 0.5 mg/L, at least 1 mg/L, at least 1.5 mg/L, at least 2 mg/L, at least 2.5 mg/L, at least 3 mg/L, at least 3.5 mg/L, at least 4 mg/L, at least 4.5 mg/L, at least 5 mg/L, at least 5.5 mg/L, at least 6 mg/L, at least 6.5 mg/L, at least 7 mg/L, at least 7.5 mg/L, at least 8 mg/L, at least 8.5 mg/L, at least 9 mg/L, at least 9.5 mg/L, at least 10 mg/L, at least 10.5 mg/L, at least 11 mg/L, at least 11.5 mg/L, at least 12 mg/L, at least 12.5 mg/L, at least 13 mg/L, at least 13.5 mg/L, at least 14 mg/L, at least 14.5 mg/L, at least 15 mg/L, at least 15.5 mg/L, at least 16 mg/L, at least 16.5 mg/L, at least 17 mg/L, at least 17.5 mg/L, at least 18 mg/L, at least 18.5 mg/L, at least 19 mg/L, at least 19.5 mg/L, at least 20 mg/L, at least 20.5 mg/L, at least 21 mg/L, at least 21.5 mg/L, at least 22 mg/L, at least 22.5 mg/L, at least 23 mg/L, at least 23.5 mg/L, at least 24 mg/L, at least 24.5 mg/L, at least 25 mg/L, at least 25.5 mg/L, at least 26 mg/L, at least 26.5 mg/L, at least 27 mg/L, at least 27.5 mg/L, at least 28 mg/L, at least 28.5 mg/L, at least 29 mg/L, at least 29.5 mg/L, at least 30 mg/L, at least 30.5 mg/L, at least 31 mg/L, at least 31.5 mg/L, at least 32 mg/L, at least 32.5 mg/L, at least 33 mg/L, at least 33.5 mg/L, at least 34 mg/L, at least 34.5 mg/L, at least 35 mg/L, at least 35.5 mg/L, at least 36 mg/L, at least 36.5 mg/L, at least 37 mg/L, at least 37.5 mg/L, at least 38 mg/L, at least 38.5 mg/L, at least 39 mg/L, at least 39.5 mg/L, at least 40 mg/L, at least 40.5 mg/L, at least 41 mg/L, at least 41.5 mg/L, at least 42 mg/L, at least 42.5 mg/L, at least 43 mg/L, at least 43.5 mg/L, at least 44 mg/L, at least 44.5 mg/L, at least 45 mg/L, at least 45.5 mg/L, at least 46 mg/L, at least 46.5 mg/L, at least 47 mg/L, at least 47.5 mg/L, at least 48 mg/L, at least 48.5 mg/L, at least 49 mg/L, at least 49.5 mg/L, at least 50 mg/L, at least 50.5 mg/L, at least 51 mg/L, at least 51.5 mg/L, at least 52 mg/L, at least 52.5 mg/L, at least 53 mg/L, at least 53.5 mg/L, at least 54 mg/L, at least 54.5 mg/L, at least 55 mg/L, at least 55.5 mg/L, at least 56 mg/L, at least 56.5 mg/L, at least 57 mg/L, at least 57.5 mg/L, at least 58 mg/L, at least 58.5 mg/L, at least 59 mg/L, at least 59.5 mg/L, at least 60 mg/L, at least 60.5 mg/L, at least 61 mg/L, at least 61.5 mg/L, at least 62 mg/L, at least 62.5 mg/L, at least 63 mg/L, at least 63.5 mg/L, at least 64 mg/L, at least 64.5 mg/L, at least 65 mg/L, at least 65.5 mg/L, at least 66 mg/L, at least 66.5 mg/L, at least 67 mg/L, at least 67.5 mg/L, at least 68 mg/L, at least 68.5 mg/L, at least 69 mg/L, at least 69.5 mg/L, at least 70 mg/L, at least 70.5 mg/L, at least 71 mg/L, at least 71.5 mg/L, at least 72 mg/L, at least 72.5 mg/L, at least 73 mg/L, at least 73.5 mg/L, at least 74 mg/L, at least 74.5 mg/L, at least 75 mg/L, at least 75.5 mg/L, at least 76 mg/L, at least 76.5 mg/L, at least 77 mg/L, at least 77.5 mg/L, at least 78 mg/L, at least 78.5 mg/L, at least 79 mg/L, at least 79.5 mg/L, at least 80 mg/L, at least 80.5 mg/L, at least 81 mg/L, at least 81.5 mg/L, at least 82 mg/L, at least 82.5 mg/L, at least 83 mg/L, at least 83.5 mg/L, at least 84 mg/L, at least 84.5 mg/L, at least 85 mg/L, at least 85.5 mg/L, at least 86 mg/L, at least 86.5 mg/L, at least 87 mg/L, at least 87.5 mg/L, at least 88 mg/L, at least 88.5 mg/L, at least 89 mg/L, at least 89.5 mg/L, at least 90 mg/L, at least 90.5 mg/L, at least 91 mg/L, at least 91.5 mg/L, at least 92 mg/L, at least 92.5 mg/L, at least 93 mg/L, at least 93.5 mg/L, at least 94 mg/L, at least 94.5 mg/L, at least 95 mg/L, at least 95.5 mg/L, at least 96 mg/L, at least 96.5 mg/L, at least 97 mg/L, at least 97.5 mg/L, at least 98 mg/L, at least 98.5 mg/L, at least 99 mg/L, at least 99.5 mg/L, or at least 100 mg/L of a product (e.g., a compound of Formula (4), (5), and/or (6)).
In some instances, OLSs may form triketide (PDAL) and/or tetraketide (HTAL and olivetol) by-products. Triketides convert to PDAL, and tetraketides convert to HTAL and olivetol, not to olivetolic acid. In some embodiments, production of by-products is undesirable. In some embodiments, OLS enzymes described herein do not produce by-products or produce minimal by-products relative to a control. In some embodiments, OLS enzymes are selected, at least in part, based on the ratio of olivetolic acid produced relative to olivetol.
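As a non-limiting illustration of selecting OLS enzymes based on the ratio of olivetolic acid to olivetol, candidate enzymes may be ranked by that ratio computed from measured titers. The sketch below is one possible ranking scheme; the enzyme labels and titer values are hypothetical placeholders and are not results from this disclosure.

```python
def rank_by_oa_ratio(titers: dict[str, tuple[float, float]]) -> list[tuple[str, float]]:
    """Rank candidates by olivetolic acid : olivetol ratio, highest first.

    `titers` maps a candidate label to (olivetolic_acid, olivetol) amounts
    measured in the same units.
    """
    ratios = {}
    for name, (oa, olivetol) in titers.items():
        # No olivetol detected: treat the ratio as unbounded.
        ratios[name] = float("inf") if olivetol == 0 else oa / olivetol
    return sorted(ratios.items(), key=lambda item: item[1], reverse=True)


# Hypothetical measurements (mg/L), for illustration only.
candidates = {
    "OLS_A": (12.0, 6.0),
    "OLS_B": (9.0, 1.5),
    "OLS_C": (20.0, 20.0),
}
for name, ratio in rank_by_oa_ratio(candidates):
    print(name, ratio)
```

In this toy example the candidate with the highest total olivetolic acid is not the one with the most favorable ratio, which is why the ratio may be considered alongside absolute titer.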

It was surprisingly discovered herein that OLSs can exhibit both OLS and OAC activity. PKS enzymes described in this application may or may not have cyclase activity. In some embodiments where the PKS enzyme does not have cyclase activity, one or more exogenous polynucleotides that encode a polyketide cyclase (PKC) enzyme may also be co-expressed in the same host cells to enable conversion of hexanoic acid, butyric acid, or another fatty acid into olivetolic acid, divarinolic acid, or other precursors of cannabinoids. In some embodiments, the PKS enzyme and a PKC enzyme are expressed as separate and distinct enzymes. In some embodiments, a PKS enzyme that lacks cyclase activity and a PKC are linked as part of a fusion polypeptide that is a bifunctional PKS. In some embodiments, a bifunctional PKC is referred to

As used in this application, a bifunctional PKS is an enzyme that is capable of producing a compound of Formula (6):

from a compound of Formula (2):

and a compound of Formula (3):

In some embodiments, a PKS produces more of a compound of Formula (6):

as compared to a compound of Formula (5):

As a non-limiting example, a compound of Formula (6):

is olivetolic acid (Formula (6a)):

As a non-limiting example, a compound of Formula (5):

is olivetol (Formula (5a)):

In some embodiments, a polyketide synthase of the present disclosure is capable of catalyzing a compound of Formula (2):

and a compound of Formula (3):

to produce a compound of Formula (4):

and also further catalyzes a compound of Formula (4):

to produce a compound of Formula (6):

In some embodiments, the PKS is not a fusion protein. In some embodiments, a PKS that is capable of catalyzing a compound of Formula (2):

and a compound of Formula (3):

to produce a compound of Formula (4):

and is also capable of further catalyzing the production of a compound of Formula (6):

from the compound of Formula (4):

is preferred because it avoids the need for an additional polyketide cyclase to produce a compound of Formula (6):

In some embodiments, such an enzyme that is a bifunctional PKS eliminates the transport considerations needed with addition of a polyketide cyclase, whereby the compound of Formula (4), being the product of the PKS, must be transported to the polyketide cyclase for use as a substrate to be converted into the compound of Formula (6).

In some embodiments, a PKS is capable of producing olivetolic acid in the presence of a compound of Formula (2a):

and Formula (3a):

In some embodiments, an OLS is capable of producing olivetolic acid in the presence of a compound of Formula (2a):

and Formula (3a):

Without being bound by a particular theory, the presence of the amino acid W at a residue in a PKS corresponding to position 339 of SEQ ID NO: 6 may render or enhance bifunctionality of a PKS. In some embodiments, a bifunctional PKS comprises the amino acid W at a residue corresponding to position 339 of SEQ ID NO: 6. In some embodiments, a bifunctional PKS does not comprise the amino acid S at a residue corresponding to position 339 of SEQ ID NO: 6. As a non-limiting example, a PKS may comprise the amino acid substitution S332W relative to SEQ ID NO: 5 (see, e.g., t606899, SEQ ID NO: 298). In some embodiments, a PKS may comprise the amino acid substitution S339W relative to SEQ ID NO: 7 (see, e.g., t607377, SEQ ID NO: 409). In some embodiments, the PKS comprises a sequence that is at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or is 100% identical, including all values in between, to a sequence (e.g., nucleic acid or amino acid sequence) set forth in SEQ ID NO: 6.
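As a non-limiting illustration of the substitution notation used above, "S339W" denotes replacement of serine (S) at position 339 (counting from 1) of the stated reference sequence with tryptophan (W). The sketch below parses such a label and applies it to a sequence using the sequence's own numbering; it does not perform the alignment that would be needed to find a "corresponding" residue in a different PKS. The function name and the toy sequence are hypothetical.

```python
import re


def apply_substitution(sequence: str, substitution: str) -> str:
    """Apply a substitution such as 'S339W' to `sequence` (1-based position)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {substitution!r}")
    original, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert to 0-based index
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != original:
        raise ValueError(
            f"expected {original} at position {index + 1}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical five-residue toy sequence, not a sequence from this disclosure.
print(apply_substitution("MKSAV", "S3W"))
```

Checking the original residue before substituting guards against applying a label to the wrong reference numbering, which matters here because the same change is written as S332W relative to one reference and S339W relative to another.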

In some embodiments, an OLS is capable of producing olivetolic acid in the presence of a compound of Formula (2a) and Formula (3a). In some embodiments, the OLS produces more olivetolic acid (OA) than olivetol. In some embodiments, the OLS produces at least 1.1 times, 1.2 times, 1.3 times, 1.4 times, 1.5 times, 1.6 times, 1.7 times, 1.8 times, 1.9 times, 2 times, 2.1 times, 2.2 times, 2.3 times, 2.4 times, 2.5 times, 2.6 times, 2.7 times, 2.8 times, 2.9 times, 3 times, 3.1 times, 3.2 times, 3.3 times, 3.4 times, 3.5 times, 3.6 times, 3.7 times, 3.8 times, 3.9 times, 4 times, 5 times, 6 times, 8 times, 9 times, 10 times, 20 times, 30 times, 40 times, 50 times, 60 times, 70 times, 80 times, 90 times, 100 times, 200 times, 300 times, 400 times, 500 times, 600 times, 700 times, 800 times or 1,000 times more olivetolic acid (OA) than olivetol.

Without wishing to be bound by any theory, in some embodiments, bifunctional OLSs differ from other OLSs in the geometry of the substrate binding pocket, an internal substrate holding cavity, and/or a substrate exit tunnel. For example, the substrate binding pocket of the bifunctional OLSs may be wider as compared to the substrate binding pocket of Cannabis sativa OLS (SEQ ID NO: 5). Without wishing to be bound by any theory, this extra space may alleviate steric clashes between the protein and substrate and permit the pro-cyclization configuration.

Polyketide Cyclase (PKC)

A host cell described in this application may comprise a PKC. As used in this application, a “PKC” refers to an enzyme that is capable of cyclizing a polyketide.

In certain embodiments, a polyketide cyclase (PKC) catalyzes the cyclization of an oxo fatty acyl-CoA (e.g., a compound of Formula (4):

or 3,5,7-trioxododecanoyl-CoA, 3,5,7-trioxodecanoyl-CoA) to the corresponding intramolecular cyclization product (e.g., compound of Formula (6), including olivetolic acid and divarinic acid). In some embodiments, a PKC catalyzes the formation of a compound which occurs in the presence of a PKS. PKC substrates include trioxoalkanoyl-CoA, such as 3,5,7-trioxododecanoyl-CoA, or a compound of Formula (4):

wherein R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl. In certain embodiments, a PKC catalyzes a compound of Formula (4):

wherein R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; to form a compound of Formula (6):

wherein R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; as substrates. R is as defined in this application. In some embodiments, R is a C2-C6 optionally substituted alkyl. In some embodiments, R is a propyl or pentyl. In some embodiments, R is pentyl. In some embodiments, R is propyl. In certain embodiments, a PKC is an olivetolic acid cyclase (OAC). In certain embodiments, a PKC is a divarinic acid cyclase (DAC).

As one of ordinary skill in the art would appreciate, a PKC could be obtained from any source, including naturally occurring sources and synthetic sources (e.g., a non-naturally occurring PKC). In some embodiments, a PKC is from Cannabis. Non-limiting examples of PKCs include those disclosed in U.S. Pat. Nos. 9,611,460; 10,059,971; and US Pub 2019/0169661, which are incorporated by reference in this application in their entireties.

In some embodiments, a PKC is an OAC. As used in this application, an “OAC” refers to an enzyme that is capable of catalyzing the formation of olivetolic acid (OA). In some embodiments, an OAC is an enzyme that is capable of using a substrate of Formula (4a) (3,5,7-trioxododecanoyl-CoA):

to form a compound of Formula (6a) (olivetolic acid):

Olivetolic acid cyclase from C. sativa (CsOAC) is a 101 amino acid enzyme that performs non-decarboxylative cyclization of the tetraketide product of olivetol synthase (FIG. 4 Structure 4a) via aldol condensation to form olivetolic acid (FIG. 4 Structure 6a). CsOAC was identified and characterized by Gagne et al. (PNAS 2012) via transcriptome mining, and its cyclization function was recapitulated in vitro to demonstrate that CsOAC is required for formation of olivetolic acid in C. sativa. A crystal structure of the enzyme was published by Yang et al. (FEBS J. 2016 March; 283(6):1088-106), which revealed that the enzyme is a homodimer and belongs to the dimeric α+β barrel (DABB) superfamily of protein folds. CsOAC is the only known plant polyketide cyclase. Multiple fungal Type III polyketide synthases have been identified that perform both polyketide synthase and cyclization functions (Funa et al., J Biol Chem. 2007 May 11; 282(19):14476-81); however, in plants such a dual function enzyme has not yet been discovered.

A non-limiting example of an amino acid sequence of an OAC from C. sativa is provided by UniProtKB-I6WU39 (SEQ ID NO: 125), which catalyzes the formation of olivetolic acid (OA) from 3,5,7-trioxododecanoyl-CoA.

The sequence of UniProtKB-I6WU39 (SEQ ID NO: 125) is:

MAVKHLIVLKFKDEITEAQKEEFFKTYVNLVNIIPAMKDVYWGKDVTQKNKEEGYTHIVEVTFESVETIQDYIIHPAHVGFGDVYRSFWEKLLIFDYTPRK.

A non-limiting example of a nucleic acid sequence encoding C. sativa OAC is:

(SEQ ID NO: 130) atggcagtgaagcatttgattgtattgaagttcaaagatgaaatcacagaagcccaaaaggaagaatttttcaagacgtatgtgaatcttgtgaatatcatcccagccatgaaagatgtatactggggtaaagatgtgactcaaaagaataaggaagaagggtacactcacatagttgaggtaacatttgagagtgtggagactattcaggactacattattcatcctgcccatgttggatttggagatgtctatcgttctttctgggaaaaacttctcatttttgactacacaccacgaaag.
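As a non-limiting illustration, the relationship between the nucleic acid sequence of SEQ ID NO: 130 and the amino acid sequence of SEQ ID NO: 125 can be checked by translating the former with the standard genetic code. The sketch below copies both sequences from the text above (with wrapping spaces removed); the variable and function names are hypothetical, and use of the standard codon table is an assumption.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SEQ_ID_NO_130 = (
    "atggcagtgaagcatttgattgtattgaag"
    "ttcaaagatgaaatcacagaagcccaaaag"
    "gaagaatttttcaagacgtatgtgaatctt"
    "gtgaatatcatcccagccatgaaagatgta"
    "tactggggtaaagatgtgactcaaaagaat"
    "aaggaagaagggtacactcacatagttgag"
    "gtaacatttgagagtgtggagactattcag"
    "gactacattattcatcctgcccatgttgga"
    "tttggagatgtctatcgttctttctgggaa"
    "aaacttctcatttttgactacacaccacga"
    "aag"
)
SEQ_ID_NO_125 = (
    "MAVKHLIVLKFKDEITEAQKEEFFKTYVNLVNIIPAMKDVYWGKDVTQKN"
    "KEEGYTHIVEVTFESVETIQDYIIHPAHVGFGDVYRSFWEKLLIFDYTPR"
    "K"
)

protein = translate(SEQ_ID_NO_130)
print(len(protein), protein == SEQ_ID_NO_125)
```

The translated product has 101 residues, consistent with the length stated for CsOAC, and the coding sequence as listed contains no stop codon.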

Prenyltransferase (PT)

A host cell described in this application may comprise a prenyltransferase (PT). As used in this application, a “PT” refers to an enzyme that is capable of transferring prenyl groups to acceptor molecule substrates. Non-limiting examples of prenyltransferases are described in WO2018200888 (e.g., CsPT4), U.S. Pat. No. 8,884,100 (e.g., CsPT1); CA2718469; Valliere et al., Nat Commun. 2019 Feb. 4; 10(1):565; and Luo et al., Nature 2019 March; 567(7746):123-126, which are incorporated by reference in their entireties. In some embodiments, a PT is capable of producing cannabigerolic acid (CBGA), cannabigerovarinic acid (CBGVA), or other cannabinoids or cannabinoid-like substances. In some embodiments, a PT is cannabigerolic acid synthase (CBGAS). In some embodiments, a PT is cannabigerovarinic acid synthase (CBGVAS).

In some embodiments, the PT is an NphB prenyltransferase. See, e.g., U.S. Pat. No. 7,544,498; and Kumano et al., Bioorg Med Chem. 2008 Sep. 1; 16(17): 8117-8126, which are incorporated by reference in this application in their entireties. In some embodiments, a PT corresponds to NphB from Streptomyces sp. (see, e.g., UniprotKB Accession No. Q4R2T2; see also SEQ ID NO: 2 of U.S. Pat. No. 7,361,483). The protein sequence corresponding to UniprotKB Accession No. Q4R2T2 is provided by SEQ ID NO: 131:

(SEQ ID NO: 131) MSEAADVERVYAAMEEAAGLLGVACARDKIYPLLSTFQDTLVEGGSVVVFSMASGRHSTELDFSISVPTSHGDPYATVVEKGLFPATGHPVDDLLADTQKHLPVSMFAIDGEVTGGFKKTYAFFPTDNMPGVAELSAIPSMPPAVAENAELFARYGLDKVQMTSMDYKKRQVNLYFSELSAQTLEAESVLALVRELGLHVPNELGLKFCKRSFSVYPTLNWETGKIDRLCFAVISNDPTLVPSSDEGDIEKFHNYATKAPYAYVGEKRTLVYGLTLSPKEEYYKLGAYYHITDVQRGLLKAFDSLED.

A non-limiting example of a nucleic acid sequence encoding NphB is:

(SEQ ID NO: 132) atgtcagaagccgcagatgtcgaaagagtttacgccgctatggaagaagccgccggtttgttaggtgttgcctgtgccagagataagatctacccattgttgtctacttttcaagatacattagttgaaggtggttcagttgttgttttctctatggcttcaggtagacattctacagaattggatttctctatctcagttccaacatcacatggtgatccatacgctactgttgttgaaaaaggtttatttccagcaacaggtcatccagttgatgatttgttggctgatactcaaaagcatttgccagtttctatgtttgcaattgatggtgaagttactggtggtttcaagaaaacttacgctttctttccaactgataacatgccaggtgttgcagaattatctgctattccatcaatgccaccagctgttgcagaaaatgcagaattatttgctagatacggtttggataaggttcaaatgacatctatggattacaagaaaagacaagttaatttgtacttttctgaattatcagcacaaactttggaagctgaatcagttttggcattagttagagaattgggtttacatgttccaaacgaattgggtttgaagttttgtaaaagatctttctcagtttatccaactttaaactgggaaacaggcaagatcgatagattatgtttcgcagttatctctaacgatccaacattggttccatcttcagatgaaggtgatatcgaaaagtttcataactacgctactaaagcaccatatgcttacgttggtgaaaagagaacattagtttatggtttgactttatcaccaaaggaagaatactacaagttgggtgcttactaccacattaccgacgtacaaagaggtttattgaaagcattcgatagtttagaagactaa.

In other embodiments, a PT corresponds to CsPT1, which is disclosed as SEQ ID NO:2 in U.S. Pat. No. 8,884,100 (C. sativa; corresponding to SEQ ID NO: 110 in this application):

(SEQ ID NO: 110) MGLSSVCTFSFQTNYHTLLNPHNNNPKTSLLCYRHPKTPIKYSYNNFPSKHCSTKSFHLQNKCSESLSIAKNSIRAATTNQTEPPESDNHSVATKILNFGKACWKLQRPYTIIAFTSCACGLFGKELLHNTNLISWSLMFKAFFFLVAILCIASFTTTINQIYDLHIDRINKPDLPLASGEISVNTAWEVISIIVALFGLIITIKMKGGPLYIFGYCFGIFGGIVYSVPPFRWKQNPSTAFLLNFLAHIITNFTFYYASRAALGLPFELRPSFTFLLAFMKSMGSALALIKDASDVEGDTKFGISTLASKYGSRNLTLFCSGIVLLSYVAAILAGIIWPQAFNSNVMLLSHAILAFWLILQTRDFALTNYDPEAGRRFYEFMWKLYYAEYLVYVFI.

In some embodiments, a PT corresponds to CsPT4, which is disclosed as SEQ ID NO:1 in WO2019071000, corresponding to SEQ ID NO: 133 in this application:

(SEQ ID NO: 133) MGLSLVCTFSFQTNYHTLLNPHNKNPKNSLLSYQHPKTPIIKSSYDNFPSKYCLTKNFHLLGLNSHNRISSQSRSIRAGSDQIEGSPHHESDNSIATKILNFGHTCWKLQRPYVVKGMISIACGLFGRELFNNRHLFSWGLMWKAFFALVPILSFNFFAAIMNQIYDVDIDRINKPDLPLVSGEMSIETAWILSIIVALTGLIVTIKLKSAPLFVFIYIFGIFAGFAYSVPPIRWKQYPFTNFLITISSHVGLAFTSYSATTSALGLPFVWRPAFSFITAFMTVMGMTIAFAKDISDIEGDAKYGVSTVATKLGARNIVITFVVSGVLLLNYLVSISIGIIWPQVFKSNEVIILSHAILAFCLIFQTRELALANYASAPSRQFFEFIWLLYYAEYFVYVFI.

In some embodiments, a PT corresponds to a truncated CsPT4, which is provided as SEQ ID NO: 134 herein:

MSAGSDQIEGSPHHESDNSIATKILNFGHTCWKLQRPYVVKGMISIACGLFGRELFNNRHLFSWGLMWKAFFALVPILSFNFFAAIMNQIYDVDIDRINKPDLPLVSGEMSIETAWILSIIVALTGLIVTIKLKSAPLFVFIYIFGIFAGFAYSVPPIRWKQYPFTNFLITISSHVGLAFTSYSATTSALGLPFVWRPAFSFITAFMTVMGMTIAFAKDISDIEGDAKYGVSTVATKLGARNMTFVVSGVLLLNYLVSISIGIIWPQVFKSNEVIILSHAILAFCLIFQTRELALANYASAPSRQFFEFIWLLYYAEYFVYVFI.

Functional expression of paralog C. sativa CBGAS enzymes in S. cerevisiae and production of the major cannabinoid CBGA have been reported (Page and Boubakir US 20120144523, 2012, and Luo et al. Nature, 2019). Luo et al. reported the production of CBGA in S. cerevisiae by expressing a truncated version of a C. sativa CBGAS, CsPT4, with its native signal peptide removed (Luo et al. Nature, 2019). Without being bound by a particular theory, the integral-membrane nature of C. sativa CBGAS enzymes may render functional expression of C. sativa CBGAS enzymes in heterologous hosts challenging. Removal of transmembrane domain(s) or signal sequences or use of prenyltransferases that are not associated with the membrane and are not integral membrane proteins may facilitate increased interaction between the enzyme and available substrate, for example in the cellular cytosol and/or in organelles that may be targeted using peptides that confer localization.

In some embodiments, the PT is a soluble PT. In some embodiments, the PT is a cytosolic PT. In some embodiments, the PT is a secreted protein. In some embodiments, the PT is not a membrane-associated protein. In some embodiments, the PT is not an integral membrane protein. In some embodiments, the PT does not comprise a transmembrane domain or a predicted transmembrane domain. In some embodiments, the PT may be primarily detected in the cytosol (e.g., detected in the cytosol to a greater extent than detected associated with the cell membrane). In some embodiments, the PT is a protein from which one or more transmembrane domains have been removed and/or mutated (e.g., by truncation, deletions, substitutions, insertions, and/or additions) so that the PT localizes or is predicted to localize in the cytosol of the host cell, or to cytosolic organelles within the host cell, or, in the case of bacterial hosts, in the periplasm. In some embodiments, the PT is a protein from which one or more transmembrane domains have been removed or mutated (e.g., by truncation, deletions, substitutions, insertions, and/or additions) so that the PT has increased localization to the cytosol, organelles, or periplasm of the host cell, as compared to membrane localization.

Within the scope of the term “transmembrane domains” are predicted or putative transmembrane domains in addition to transmembrane domains that have been empirically determined. In general, transmembrane domains are characterized by a region of hydrophobicity that facilitates integration into the cell membrane. Methods of predicting whether a protein is a membrane protein or a membrane-associated protein are known in the art and may include, for example, amino acid sequence analysis, hydropathy plots, and/or protein localization assays.

In some embodiments, the PT is a protein from which a signal sequence has been removed and/or mutated so that the PT is not directed to the cellular secretory pathway. In some embodiments, the PT is a protein from which a signal sequence has been removed and/or mutated so that the PT is localized to the cytosol or has increased localization to the cytosol (e.g., as compared to the secretory pathway).
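As a non-limiting illustration of a hydropathy plot, a sliding-window average of Kyte-Doolittle hydropathy values may be computed along a sequence, with sustained hydrophobic windows flagged as candidate transmembrane segments. The sketch below uses the published Kyte-Doolittle scale; the 19-residue window and the 1.6 cutoff are a commonly used heuristic and are assumptions here, not parameters specified in this disclosure. The function names and test sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def has_candidate_tm_segment(sequence: str, window: int = 19,
                             threshold: float = 1.6) -> bool:
    """True if any window reaches the (assumed) hydrophobicity cutoff."""
    return any(mean >= threshold for mean in hydropathy_profile(sequence, window))


# Hypothetical toy sequences: a poly-leucine stretch and a poly-lysine stretch.
print(has_candidate_tm_segment("L" * 25))
print(has_candidate_tm_segment("K" * 25))
```

A prediction of this kind is only a screen; as noted above, protein localization assays may be used to determine localization empirically.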

In some embodiments, the PT is a secreted protein. In some embodiments,the PT contains a signal sequence.

In some embodiments, a PT is a fusion protein. For example, a PT may be fused to one or more genes in the metabolic pathway of a host cell. In certain embodiments, a PT may be fused to mutant forms of one or more genes in the metabolic pathway of a host cell.

In some embodiments, a PT described in this application transfers one or more prenyl groups to any of positions 1, 2, 3, 4, or 5 in a compound of Formula (6), shown below:

In some embodiments, the PT transfers a prenyl group to any of positions 1, 2, 3, 4, or 5 in a compound of Formula (6), shown below:

to form a compound of one or more of Formula (8w), Formula (8x), Formula (8′), Formula (8y), and/or Formula (8z):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof, wherein a is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.

Terminal Synthases (TS)

A host cell described in this application may comprise a terminal synthase (TS). As used in this application, a “TS” refers to an enzyme that is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) to produce a ring-containing product (e.g., heterocyclic ring-containing product). In certain embodiments, a TS is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) to produce a carbocyclic-ring containing product (e.g., cannabinoid). In certain embodiments, a TS is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) to produce a heterocyclic-ring containing product (e.g., cannabinoid). In certain embodiments, a TS is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) to produce a cannabinoid.

In some embodiments, a TS is a tetrahydrocannabinolic acid synthase (THCAS), a cannabidiolic acid synthase (CBDAS), and/or a cannabichromenic acid synthase (CBCAS). As one of ordinary skill in the art would appreciate, a TS could be obtained from any source, including naturally occurring sources and synthetic sources (e.g., a non-naturally occurring TS).

a. Substrates

A TS may be capable of using one or more substrates. In some instances, the location of the prenyl group and/or the R group differs between TS substrates. For example, a TS may be capable of using as a substrate one or more compounds of Formula (8w), Formula (8x), Formula (8′), Formula (8y), and/or Formula (8z):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof, wherein a is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.

In certain embodiments, a compound of Formula (8′) is a compound of Formula (8):

In some embodiments, a TS catalyzes oxidative cyclization of the prenyl moiety (e.g., terpene) of a compound of Formula (8) described in this application and shown in FIG. 2. In certain embodiments, a compound of Formula (8) is a compound of Formula (8a):

b. Products

In embodiments wherein CBGA is the substrate, the TS enzymes CBDAS, THCAS and CBCAS would generally catalyze the formation of cannabidiolic acid (CBDA), Δ9-tetrahydrocannabinolic acid (THCA) and cannabichromenic acid (CBCA), respectively. However, in some embodiments, a TS can produce more than one different product depending on reaction conditions. For example, the pH of the reaction environment may cause a THCAS or a CBDAS to produce CBCA in greater proportions than THCA or CBDA, respectively (see, for example, U.S. Pat. No. 9,359,625 to Winnicki and Donsky, incorporated by reference in its entirety).

A TS may be capable of using one or more substrates described in this application to produce one or more products. Non-limiting examples of TS products are shown in Table 1. In some instances, a TS is capable of using one substrate to produce 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 different products. In some embodiments, a TS is capable of using more than one substrate to produce 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 different products.

In some embodiments, a TS is capable of producing a compound of Formula (X-A) and/or a compound of Formula (X-B):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof, wherein

is a double bond or a single bond, as valency permits;

R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

R^(Z1) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

R^(Z2) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl;

or optionally, R^(Z1) and R^(Z2) are taken together with their intervening atoms to form an optionally substituted carbocyclic ring;

R^(3A) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl;

R^(3B) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl; and/or

R^(Y) is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, or optionally substituted alkynyl.

In some embodiments, a compound of Formula (X-A) is:

In certain embodiments, a compound of Formula (10)

has a chiral atom labeled with * at carbon 10 and a chiral atom labeled with ** at carbon 6. In certain embodiments, in a compound of Formula (10)

the chiral atom labeled with * at carbon 10 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, in a compound of Formula (10)

the chiral atom labeled with * at carbon 10 is of the S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (10)

the chiral atom labeled with * at carbon 10 is of the R-configuration and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, a compound of Formula (10)

is of the formula:

In certain embodiments, in a compound of Formula (10)

the chiral atom labeled with * at carbon 10 is of the S-configuration and a chiral atom labeled with ** at carbon 6 is of the S-configuration. In certain embodiments, a compound of Formula (10)

is of the formula:

In certain embodiments, a compound of Formula (10a)

has a chiral atom labeled with * at carbon 10 and a chiral atom labeled with ** at carbon 6. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the S-configuration; and a chiral atom labeled with ** at carbon 6 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the R-configuration and a chiral atom labeled with ** at carbon 6 is of the R-configuration. In certain embodiments, a compound of Formula (10a)

is of the formula:

In certain embodiments, in a compound of Formula (10a)

the chiral atom labeled with * at carbon 10 is of the S-configuration and a chiral atom labeled with ** at carbon 6 is of the S-configuration. In certain embodiments, a compound of Formula (10a)

is of the formula:

In some embodiments, a compound of Formula (X-A) is:

In some embodiments, a compound of Formula (X-A) is:

In some embodiments, a compound of Formula (X-B) is:

In certain embodiments, a compound of Formula (9)

has a chiral atom labeled with * at carbon 3 and a chiral atom labeled with ** at carbon 4. In certain embodiments, in a compound of Formula (9)

the chiral atom labeled with * at carbon 3 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, in a compound of Formula (9)

the chiral atom labeled with * at carbon 3 is of the S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (9)

the chiral atom labeled with * at carbon 3 is of the R-configuration and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, a compound of Formula (9)

is of the formula:

In certain embodiments, in a compound of Formula (9)

the chiral atom labeled with * at carbon 3 is of the S-configuration and a chiral atom labeled with ** at carbon 4 is of the S-configuration. In certain embodiments, a compound of Formula (9)

is of the formula:

In certain embodiments, a compound of Formula (9a) (CBDA)

has a chiral atom labeled with * at carbon 3 and a chiral atom labeled with ** at carbon 4. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the R-configuration or S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the S-configuration; and a chiral atom labeled with ** at carbon 4 is of the R-configuration or S-configuration. In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the R-configuration and a chiral atom labeled with ** at carbon 4 is of the R-configuration. In certain embodiments, a compound of Formula (9a)

is of the formula:

In certain embodiments, in a compound of Formula (9a)

the chiral atom labeled with * at carbon 3 is of the S-configuration and a chiral atom labeled with ** at carbon 4 is of the S-configuration. In certain embodiments, a compound of Formula (9a)

is of the formula:

In some embodiments, as shown in FIG. 2, a TS is capable of producing a cannabinoid from the product of a PT, including, without limitation, an enzyme capable of producing a compound of Formula (9), (10), or (11):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof, wherein R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; produced from a compound of Formula (8′):

wherein a is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10; and R is hydrogen, optionally substituted acyl, optionally substituted alkyl, optionally substituted alkenyl, optionally substituted alkynyl, optionally substituted carbocyclyl, or optionally substituted aryl; or using any other substrate. In certain embodiments, a compound of Formula (8′) is a compound of Formula (8):

In certain embodiments, a compound of Formula (9), (10), or (11) is produced using a TS from a substrate compound of Formula (8′) (e.g., compound of Formula (8)), for example. Non-limiting examples of substrate compounds of Formula (8′) include cannabigerolic acid (CBGA), cannabigerovarinic acid (CBGVA), or cannabinerolic acid. In certain embodiments, at least one of the hydroxyl groups of the product compounds of Formula (9), (10), or (11) is further methylated. In certain embodiments, a compound of Formula (9) is methylated to form a compound of Formula (12):

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof.

Tetrahydrocannabinolic Acid Synthase (THCAS)

A host cell described in this application may comprise a TS that is a tetrahydrocannabinolic acid synthase (THCAS). As used in this application, “tetrahydrocannabinolic acid synthase (THCAS)” or “Δ¹-tetrahydrocannabinolic acid (THCA) synthase” refers to an enzyme that is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) of a compound of Formula (8) to produce a ring-containing product (e.g., heterocyclic ring-containing product, carbocyclic ring-containing product) of Formula (10). In certain embodiments, a THCAS refers to an enzyme that is capable of producing Δ⁹-tetrahydrocannabinolic acid (Δ⁹-THCA, THCA), Δ⁹-tetrahydrocannabivarinic acid A (Δ⁹-THCVA-C3 A, THCVA), THCP, or a compound of Formula (10a) from a compound of Formula (8). In certain embodiments, a THCAS is capable of producing Δ⁹-tetrahydrocannabinolic acid (Δ⁹-THCA, THCA, or a compound of Formula (10a)).

A THCAS may use cannabigerolic acid (CBGA) as a substrate. In some embodiments, the THCAS produces Δ⁹-THCA from CBGA. In some embodiments, a THCAS may catalyze the oxidative cyclization of other substrates, such as 3-geranyl-2,4-dihydroxy-6-alkylbenzoic acids like cannabigerovarinic acid (CBGVA). In some embodiments, a THCAS exhibits specificity for CBGA substrates. In some embodiments, a THCAS may use cannabivarinic acid (CBDVA) as a substrate. In some embodiments, the THCAS exhibits specificity for CBDVA substrates. In some embodiments, a THCAS may use cannabiphorol acid (CBDP) as a substrate. In some embodiments, the THCAS exhibits specificity for CBDP substrates.

In some embodiments, a THCAS is from C. sativa. C. sativa THCAS performs the oxidative cyclization of the geranyl moiety of cannabigerolic acid (CBGA) (FIG. 4, Structure 8a) to form tetrahydrocannabinolic acid (FIG. 4, Structure 10a) using covalently bound flavin adenine dinucleotide (FAD) as a cofactor and molecular oxygen as the final electron acceptor. THCAS was first discovered and characterized by Taura et al. (JACS. 1995) following extraction of the enzyme from the leaf buds of C. sativa and confirmation of its THCA synthase activity in vitro upon the addition of CBGA as a substrate. Additional analysis indicated that the enzyme is a monomer and possesses FAD binding and Berberine Bridge Enzyme (BBE) sequence motifs. A crystal structure of the enzyme published by Shoyama et al. (J Mol Biol. 2012 Oct. 12; 423(1):96-105) revealed that the enzyme covalently binds to a molecule of the cofactor FAD. See also, e.g., Sirikantaramas et al., J. Biol. Chem. 2004 Sep. 17; 279(38):39767-39774. There are several THCAS isozymes in Cannabis sativa.

In some embodiments, a C. sativa THCAS (UniProtKB Accession No.: I1V0C5) comprises the amino acid sequence shown below:

(SEQ ID NO: 135) MNCSAFSFWFVCKIIFFFLSFNIQISIANPQENFLKCFSEYIPNNPANPKFIYTQHDQLYMSVLNSTIQNLRFTSDTTPKPLVIVTPSNVSHIQASILCSKKVGLQIRTRSGGHDAEGMSYISQVPFVVVDLRNMHSIKIDVHSQTAWVEAGATLGEVYYWINEKNENFSFPGGYCPTVGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFWAIRGGGGENFGIIAAWKIKLVAVPSKSTIFSVKKNIVIEIHGLVKLFNKWQNIAYKYDKDLVLMTHFITKNITDNHGKNKTTVHGYFSSIFHGGVDSLVDLMNKSFPELGIKKTDCKEFSWIDTTIFYSGVVNFNTANFKKEILLDRSAGKKTAFSIKLDYVKKPIPETAMVKILEKLYEEDVGVGMYVLYPYGGIIVIEEISESAIPFPHRAGIIVIYELWYTASWEKQEDNEKHINWVRSVYNFTTPYVSQNPRLAYLNYRDLDLGKTNPESPNNYTQARIWGEKYFGKNFNRLVKVKTKADPNNFFRNEQSIPPLPPHH H.

In some embodiments, a THCAS comprises the sequence shown below:

(SEQ ID NO: 136) NPQENFLKCFSEYIPNNPANPKFIYTQHDQLYMSVLNSTIQNLRFTSDTTPKPLVIVTPSNVSHIQASILCSKKVGLQIRTRSGGHDAEGMSYISQVPFVVVDLRNMHSIKIDVHSQTAWVEAGATLGEVYYWINEKNENFSFPGGYCPTVGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFWAIRGGGGENFGIIAAWKIKLVAVPSKSTIFSVKKNMEIHGLVKLFNKWQNIAYKYDKDLVLMTHFITKNITDNHGKNKTTVHGYFSSIFHGGVDSLVDLMNKSFPELGIKKTDCKEFSWIDTTIFYSGVVNFNTANFKKEILLDRSAGKKTAFSIKLDYVKKPIPETAMVKILEKLYEEDVGVGMYVLYPYGGEVIEEISESAIPFPHRAGEVIYELWYTASWEKQEDNEKHINWVRSVYNFTTPYVSQNPRLAYLNYRDLDLGKTNPESPNNYTQARIWGEKYFGKNFNRLVKVKTKAD PNNFFRNEQSIPPLPPHHH.

A non-limiting example of a nucleotide sequence encoding SEQ ID NO: 136 is:

(SEQ ID NO: 137) aacccgcaagaaaactttctaaaatgcttttctgaatacattcctaacaaccctgccaacccgaagtttatctacacacaacacgatcaattgtatatgagcgtgttgaatagtacaatacagaacctgaggtttacatccgacacaacgccgaaaccgctagtgatcgtcacaccctccaacgtaagccacattcaggcaagcattttatgcagcaagaaagtcggactgcagataaggacgaggtccggaggacacgacgccgaagggatgagctatatctcccaggtaccttttgtggtggtagacttgagaaatatgcactctatcaagatagacgttcactcccaaaccgcttgggttgaggcgggagccacccttggtgaggtctactactggatcaacgaaaagaatgaaaattttagctttcctgggggatattgcccaactgtaggtgttggcggccacttctcaggaggeggttatggggccttgatgcgtaactacggacttgeggccgacaacattatagacgcacatctagtgaatgtagacggcaaagttttagacaggaagagcatgggtgaggatcttttttgggcaattagaggcggagggggagaaaattttggaattatcgctgcttggaaaattaagctagttgcggtaccgagcaaaagcactatattctctgtaaaaaagaacatggagatacatggtttggtgaagctttttaataagtggcaaaacatcgcgtacaagtacgacaaagatctggttctgatgacgcattttataacgaaaaatatcaccgacaaccacggaaaaaacaaaaccacagtacatggctacttctctagtatatttcatgggggagtcgattctctggttgatttaatgaacaaatcattcccagagttgggtataaagaagacagactgtaaggagttctcttggattgacacaactatattctattcaggcgtagtcaactttaacacggcgaatttcaaaaaagagatccttctggacagatccgcaggtaagaaaactgcgttctctatcaaattggactatgtgaagaagcctattcccgaaaccgcgatggtcaagatacttgagaaattatacgaggaagatgtgggagttggaatgtacgtactttatccctatggtgggataatggaagaaatcagcgagagcgccattccatttccccatcgtgccggcatcatgtacgagctgtggtatactgcgagttgggagaagcaagaagacaacgaaaagcacattaactgggtcagatcagtttacaatttcaccaccccatacgtgtcccagaatccgcgtctggcttacttgaactaccgtgatcttgacctgggtaaaacgaacccggagtcacccaacaattacactcaagctagaatctggggagagaaatactttgggaagaacttcaacaggttagtaaaggttaaaaccaaggcagatccaaacaacttttttagaaatgaacaatccattcccccgctacccccgcaccatca c.
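As an illustration of how a nucleotide sequence relates to the polypeptide it encodes, the following minimal Python sketch translates an open reading frame using the standard genetic code. The function name `translate`, the handling of unrecognized codons as `X`, and the choice of the standard code table are assumptions made for illustration and are not part of the disclosure. The example input is the first 15 nucleotides of SEQ ID NO: 137.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame 1, into one-letter amino acids.

    Whitespace is ignored. Codons containing characters other than A, C, G, T
    are reported as "X" (an illustrative convention, not from the disclosure).
    """
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq), 3))


# First five codons of SEQ ID NO: 137
print(translate("aacccgcaagaaaac"))  # NPQEN
```

The five residues obtained here match the first five residues of SEQ ID NO: 136 (NPQEN).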

In some embodiments, a C. sativa THCAS comprises the amino acid sequence set forth in UniProtKB-Q8GTB6 (SEQ ID NO: 112):

MNCSAFSFWFVCKIIFFFLSFHIQISIANPRENFLKCFSKHIPNNVANPKLVYTQHDQLYMSILNSTIQNLRFISDTTPKPLVIVTPSNNSHIQATILCSKKVGLQIRTRSGGHDAEGMSYISQVPFVVVDLRNMHSIKIDVHSQTAWVEAGATLGEVYYWINEKNENLSFPGGYCPTVGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFWAIRGGGGENFGIIAAWKIKLVAVPSKSTIFSVKKNMEIHGLVKLFNKWQNIAYKYDKDLVLMTHFITKNITDNHGKNKTTVHGYFSSIFHGGVDSLVDLMNKSFPELGIKKTDCKEFSWIDTTIFYSGVVNFNTANFKKEILLDRSAGKKTAFSIKLDYVKKPIPETAMVKILEKLYEEDVGAGMYVLYPYGGEVIEEISESAIPFPHRAGEVIYELWYTASWEKQEDNEKHINWVRSVYNFTTPYVSQNPRLAYLNYRDLDLGKTNHASPNNYTQARIWGEKYFGKNFNRLVKVKTKVDPNNFFRNEQSIPPLPPHHH.

Additional non-limiting examples of THCAS enzymes may also be found in U.S. Pat. No. 9,512,391 and US Publication No. 2018/0179564, which are incorporated by reference in this application in their entireties.

Cannabidiolic Acid Synthase (CBDAS)

A host cell described in this application may comprise a TS that is a cannabidiolic acid synthase (CBDAS). As used in this application, a “CBDAS” refers to an enzyme that is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) of a compound of Formula (8) to produce a compound of Formula (9). In some embodiments, a compound of Formula (9) is a compound of Formula (9a) (cannabidiolic acid (CBDA)), CBDVA, or CBDP. A CBDAS may use cannabigerolic acid (CBGA) or cannabinerolic acid as a substrate. In some embodiments, a cannabidiolic acid synthase is capable of oxidative cyclization of cannabigerolic acid (CBGA) to produce cannabidiolic acid (CBDA). In some embodiments, the CBDAS may catalyze the oxidative cyclization of other substrates, such as 3-geranyl-2,4-dihydroxy-6-alkylbenzoic acids like cannabigerovarinic acid (CBGVA). In some embodiments, the CBDAS exhibits specificity for CBGA substrates.

In some embodiments, a CBDAS is from Cannabis. In C. sativa, CBDAS is encoded by the CBDAS gene and is a flavoenzyme. A non-limiting example of a CBDAS amino acid sequence is provided by UniProtKB-A6P6V9 (SEQ ID NO: 111) from C. sativa:

MKCSTFSFWFVCKIIFFFFSFNIQTSIANPRENFLKCFSQYIPNNATNLKLVYTQNNPLYMSVLNSTIHNLRFTSDTTPKPLVIVTPSHVSHIQGTILCSKKVGLQIRTRSGGHDSEGMSYISQVPFVIVDLRNMRSIKIDVHSQTAWVEAGATLGEVYYWVNEKNENLSLAAGYCPTVCAGGHFGGGGYGPLMRNYGLAADNIIDAHLVNVHGKVLDRKSMGEDLFWALRGGGAESFGIIVAWKIRLVAVPKSTMFSVKKEVIEIHELVKLVNKWQNIAYKYDKDLLLMTHFITRNITDNQGKNKTAIHTYFSSVFLGGVDSLVDLMNKSFPELGIKKTDCRQLSWIDTIIFYSGVVNYDTDNFNKEILLDRSAGQNGAFKIKLDYVKKPIPESVFVQILEKLYEEDIGAGMYALYPYGGIMDEISESAIPFPHRAGILYELWYICSWEKQEDNEKHLNWIRNIYNFMTPYVSKNPRLAYLNYRDLDIGINDPKNPNNYTQARIWGEKYFGKNFDRLVKVKTLVDPNNFFRNEQSIPPLPRHRH.

Additional non-limiting examples of CBDAS enzymes may also be found in U.S. Pat. No. 9,512,391 and US Publication No. 2018/0179564, which are incorporated by reference in this application in their entireties.

Cannabichromenic Acid Synthase (CBCAS)

A host cell described in this application may comprise a TS that is a cannabichromenic acid synthase (CBCAS). As used in this application, a “CBCAS” refers to an enzyme that is capable of catalyzing oxidative cyclization of a prenyl moiety (e.g., terpene) of a compound of Formula (8) to produce a compound of Formula (11). In some embodiments, a compound of Formula (11) is a compound of Formula (11a) (cannabichromenic acid (CBCA)), CBCVA, or CBCPA. A CBCAS may use cannabigerolic acid (CBGA) as a substrate. In some embodiments, a CBCAS produces cannabichromenic acid (CBCA) from cannabigerolic acid (CBGA). In some embodiments, the CBCAS may catalyze the oxidative cyclization of other substrates, such as 3-geranyl-2,4-dihydroxy-6-alkylbenzoic acids like cannabigerovarinic acid (CBGVA), or CBCPA. In some embodiments, the CBCAS exhibits specificity for CBGA substrates.

In some embodiments, a CBCAS is from Cannabis. In C. sativa, an amino acid sequence of a CBCAS is provided by, and incorporated by reference from, SEQ ID NO:2 disclosed in U.S. Patent Publication No. 20170211049. In other embodiments, a CBCAS may be a THCAS described in and incorporated by reference from U.S. Pat. No. 9,359,625. SEQ ID NO:2 disclosed in U.S. Patent Publication No. 20170211049 (corresponding to SEQ ID NO: 113 in this application) has the amino acid sequence:

MNCSTFSFWFVCKIIFFFLSFNIQISIANPQENFLKCFSEYIPNNPANPKFIYTQHDQLYMSVLNSTIQNLRFTSDTTPKPLVIVTPSNVSHIQASILCSKKVGLQIRTRSGGHDAEGLSYISQVPFAIVDLRNMHTVKVDIHSQTAWVEAGATLGEVYYWINEMNENFSFPGGYCPTVGVGGHFSGGGYGALMRNYGLAADNIIDAHLVNVDGKVLDRKSMGEDLFWAIRGGGGENFGIIAACKIKLVVVPSKATIFSVKKNMEIHGLVKLFNKWQNIAYKYDKDLMLTTHFRTRNITDNHGKNKTTVHGYFSSIFLGGVDSLVDLMNKSFPELGIKKTDCKELSWIDTTIFYSGVVNYNTANFKKEILLDRSAGKKTAFSIKLDYVKKLIPETAMVKILEKLYEEEVGVGMYVLYPYGGIMDEISESAIPFPHRAGEVIYELWYTATWEKQEDNEKHINWVRSVYNFTTPYVSQNPRLAYLNYRDLDLGKTNPESPNNYTQARIWGEKYFGKNFNRLVKVKTKADPNNFFRNEQSIPPLPPRHH.

Any of the enzymes, host cells, and methods described in this application may be used for the production of cannabinoids and cannabinoid precursors, such as those provided in Table 1. In general, the term “production” is used to refer to the generation of one or more products (e.g., products of interest and/or by-products/off-products), for example, from a particular substrate or reactant. The amount of production may be evaluated at any one or more steps of a pathway, such as a final product or an intermediate product, using metrics familiar to one of ordinary skill in the art. For example, the amount of production may be assessed for a single enzymatic reaction (e.g., conversion of a compound of Formula (8) to a compound of Formula (10) by a TS). Alternatively or in addition, the amount of production may be assessed for a series of enzymatic reactions (e.g., the biosynthetic pathway shown in FIG. 1 and/or FIG. 2). Production may be assessed by any metrics known in the art, for example, by assessing volumetric productivity, enzyme kinetics/reaction rate, specific productivity, biomass-specific productivity, titer, yield, and total titer of one or more products (e.g., products of interest and/or by-products/off-products).

In some embodiments, the metric used to measure production may depend on whether a continuous process is being monitored (e.g., several cannabinoid biosynthesis steps are used in combination) or whether a particular end product is being measured. For example, in some embodiments, metrics used to monitor production by a continuous process may include volumetric productivity, enzyme kinetics and reaction rate. In some embodiments, metrics used to monitor production of a particular product may include specific productivity, biomass-specific productivity, titer, yield, and/or total titer of one or more products (e.g., products of interest and/or by-products/off-products).
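As a non-limiting illustration of how several of the metrics named above relate to one another, the following Python sketch computes titer, volumetric productivity, yield, and biomass-specific productivity from a single set of measurements. The definitions used (titer as product mass per culture volume, volumetric productivity as titer per unit time, yield as product mass per mass of substrate consumed, and biomass-specific productivity as product mass per biomass per unit time) are common conventions assumed here rather than definitions given in this application, and the function name, units, and input values are hypothetical.

```python
def production_metrics(product_g, volume_l, hours, substrate_consumed_g, biomass_g):
    """Return common production metrics under the assumed conventional definitions.

    All inputs are hypothetical measurements: grams of product, liters of
    culture, hours of process time, grams of substrate consumed, and grams
    of dry cell biomass.
    """
    for name, value in [("volume_l", volume_l), ("hours", hours),
                        ("substrate_consumed_g", substrate_consumed_g),
                        ("biomass_g", biomass_g)]:
        if value <= 0:
            raise ValueError(f"{name} must be positive")
    titer = product_g / volume_l                      # g/L
    return {
        "titer_g_per_l": titer,
        "volumetric_productivity_g_per_l_h": titer / hours,
        "yield_g_per_g": product_g / substrate_consumed_g,
        "biomass_specific_productivity_g_per_g_h": product_g / (biomass_g * hours),
    }


# Hypothetical run: 2 g product in 1 L after 48 h, 20 g substrate consumed, 10 g biomass
metrics = production_metrics(2.0, 1.0, 48.0, 20.0, 10.0)
for key, value in metrics.items():
    print(f"{key}: {value:.4f}")
```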

Production of one or more products (e.g., products of interest and/or by-products/off-products) may be assessed indirectly, for example by determining the amount of a substrate remaining following termination of the reaction/fermentation. For example, for a TS that catalyzes the formation of products (e.g., a compound of Formula (10), including tetrahydrocannabinolic acid (THCA) (Formula (10a)) from a compound of Formula (8), including CBGA (Formula (8a))), production of the products may be assessed by quantifying the compound of Formula (10) directly or by quantifying the amount of substrate remaining following the reaction (e.g., amount of the compound of Formula (8)).
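The indirect assessment described above can be sketched as a simple mass balance. The following Python example computes the amount of substrate consumed and the fractional conversion from the initial and remaining substrate amounts. The function name and the input values are hypothetical, and equating consumed substrate with product formed assumes a 1:1 stoichiometry with no competing side reactions, so the result is best read as an upper bound on the product of interest.

```python
def substrate_conversion(initial_umol, remaining_umol):
    """Return (consumed_umol, fractional_conversion) from substrate depletion.

    Assumes (for illustration only) that every consumed substrate molecule
    yields one product molecule.
    """
    if initial_umol <= 0:
        raise ValueError("initial substrate amount must be positive")
    if not 0 <= remaining_umol <= initial_umol:
        raise ValueError("remaining substrate must lie between 0 and the initial amount")
    consumed = initial_umol - remaining_umol
    return consumed, consumed / initial_umol


# Hypothetical reaction: 100 umol substrate supplied, 25 umol remaining at termination
consumed, fraction = substrate_conversion(100.0, 25.0)
print(consumed, fraction)  # 75.0 0.75
```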

Variants

Aspects of the disclosure relate to nucleic acids encoding any of the polypeptides (e.g., AAE, PKS, PKC, PT, or TS) described in this application. In some embodiments, a nucleic acid encompassed by the disclosure is a nucleic acid that hybridizes under high or medium stringency conditions to a nucleic acid encoding an AAE, PKS, PKC, PT, or TS and is biologically active. For example, high stringency conditions of 0.2 to 1×SSC at 65° C. followed by a wash at 0.2×SSC at 65° C. can be used. In some embodiments, a nucleic acid encompassed by the disclosure is a nucleic acid that hybridizes under low stringency conditions to a nucleic acid encoding an AAE, PKS, PKC, PT, or TS and is biologically active. For example, low stringency conditions of 6×SSC at room temperature followed by a wash at 2×SSC at room temperature can be used. Other hybridization conditions include 3×SSC at 40 or 50° C., followed by a wash in 1 or 2×SSC at 20, 30, 40, 50, 60, or 65° C.

Hybridizations can be conducted in the presence of formaldehyde, e.g., 10%, 20%, 30%, 40% or 50%, which further increases the stringency of hybridization. The theory and practice of nucleic acid hybridization are described, e.g., in S. Agrawal (ed.), Methods in Molecular Biology, volume 20; and in Tijssen (1993), Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, e.g., part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, New York, which provide a basic guide to nucleic acid hybridization.

Variants of enzyme sequences described in this application (e.g., AAE, PKS, PKC, PT, or TS, including nucleic acid or amino acid sequences) are also encompassed by the present disclosure. A variant may share at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity with a reference sequence, including all values in between.

Unless otherwise noted, the term “sequence identity,” as known in the art, refers to a relationship between the sequences of two polypeptides or polynucleotides, as determined by sequence comparison (alignment). In some embodiments, sequence identity is determined across the entire length of a sequence (e.g., AAE, PKS, PKC, PT, or TS sequence). In some embodiments, sequence identity is determined over a region (e.g., a stretch of amino acids or nucleic acids, e.g., the sequence spanning an active site) of a sequence (e.g., AAE, PKS, PKC, PT, or TS sequence).

Identity can also refer to the degree of sequence relatedness between two sequences as determined by the number of matches between strings of two or more residues (e.g., nucleic acid or amino acid residues). Identity measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model, algorithm, or computer program.

Identity of related polypeptides or nucleic acid sequences can be readily calculated by any of the methods known to one of ordinary skill in the art. The “percent identity” of two sequences (e.g., nucleic acid or amino acid sequences) may, for example, be determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993. Such an algorithm is incorporated into the NBLAST® and XBLAST® programs (version 2.0) of Altschul et al., J. Mol. Biol. 215:403-10, 1990. BLAST® protein searches can be performed, for example, with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins described in this application. Where gaps exist between two sequences, Gapped BLAST® can be utilized, for example, as described in Altschul et al., Nucleic Acids Res. 25(17):3389-3402, 1997. When utilizing BLAST® and Gapped BLAST® programs, the default parameters of the respective programs (e.g., XBLAST® and NBLAST®) can be used, or the parameters can be adjusted appropriately as would be understood by one of ordinary skill in the art.

Another local alignment technique which may be used, for example, is based on the Smith-Waterman algorithm (Smith, T. F. & Waterman, M. S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197). A general global alignment technique which may be used, for example, is the Needleman-Wunsch algorithm (Needleman, S. B. & Wunsch, C. D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453), which is based on dynamic programming.
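To make the dynamic programming approach concrete, the following Python sketch implements a basic Needleman-Wunsch global alignment with a linear gap penalty. The scoring values (match = 1, mismatch = −1, gap = −1), the function name, and the tie-breaking order in the traceback are assumptions chosen for illustration; in practice, a substitution matrix and affine gap penalties would typically be used for protein sequences.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a

    # Trace back from the bottom-right cell, preferring diagonal moves on ties
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
print(aligned_a)  # ACGT
print(aligned_b)  # A-GT
print(total)      # 2
```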

More recently, a Fast Optimal Global Sequence Alignment Algorithm (FOGSAA) was developed that purportedly produces global alignment of nucleic acid and amino acid sequences faster than other optimal global alignment methods, including the Needleman-Wunsch algorithm. In some embodiments, the identity of two polypeptides is determined by aligning the two amino acid sequences, calculating the number of identical amino acids, and dividing by the length of one of the amino acid sequences. In some embodiments, the identity of two nucleic acids is determined by aligning the two nucleotide sequences, calculating the number of identical nucleotides, and dividing by the length of one of the nucleic acids.

For multiple sequence alignments, computer programs including Clustal Omega (Sievers et al., Mol Syst Biol. 2011 Oct. 11; 7:539) may be used.
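The calculation described above (count identical residues in an alignment, then divide by the length of one of the sequences) can be sketched in Python as follows. The function name, the use of `-` as the gap character, and the `denominator` option are assumptions introduced for illustration. Because the choice of which sequence length to divide by changes the result, the sketch exposes it explicitly.

```python
def percent_identity(aligned_a, aligned_b, denominator="shorter"):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    denominator selects which ungapped sequence length to divide by:
    "shorter" or "longer" (an illustrative choice, not fixed by the text).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count alignment columns holding the same residue in both sequences
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    len_a = len(aligned_a.replace("-", ""))
    len_b = len(aligned_b.replace("-", ""))
    if denominator == "shorter":
        length = min(len_a, len_b)
    elif denominator == "longer":
        length = max(len_a, len_b)
    else:
        raise ValueError("denominator must be 'shorter' or 'longer'")
    if length == 0:
        raise ValueError("cannot compute identity for an empty sequence")
    return 100.0 * matches / length


# Hypothetical aligned peptides: 4 identical columns, ungapped lengths 6 and 5
print(percent_identity("MKCSTF", "MNCS-F"))                        # 80.0
print(round(percent_identity("MKCSTF", "MNCS-F", "longer"), 2))    # 66.67
```

The two calls give different values for the same alignment, which is one reason a stated percent identity should be read together with the algorithm and parameters used to obtain it.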

It should be appreciated that a sequence, including a nucleic acid or amino acid sequence, may be found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims, using any method known to one of ordinary skill in the art. Different algorithms may yield different percent identity values for a given set of sequences. The claims of this application should be understood to encompass sequences for which percent identity to a reference sequence is calculated using default parameters and/or parameters typically used by the skilled artisan for a given algorithm.

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the algorithm of Karlin and Altschul Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, modified as in Karlin and Altschul Proc. Natl. Acad. Sci. USA 90:5873-77, 1993 (e.g., BLAST®, NBLAST®, XBLAST® or Gapped BLAST® programs, using default parameters of the respective programs).

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using the Smith-Waterman algorithm (Smith, T. F. & Waterman, M. S. (1981) “Identification of common molecular subsequences.” J. Mol. Biol. 147:195-197) or the Needleman-Wunsch algorithm (Needleman, S. B. & Wunsch, C. D. (1970) “A general method applicable to the search for similarities in the amino acid sequences of two proteins.” J. Mol. Biol. 48:443-453).

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using a Fast Optimal Global Sequence Alignment Algorithm (FOGSAA).

In some embodiments, a sequence, including a nucleic acid or amino acid sequence, is found to have a specified percent identity to a reference sequence, such as a sequence disclosed in this application and/or recited in the claims when sequence identity is determined using Clustal Omega (Sievers et al., Mol Syst Biol. 2011 Oct. 11; 7:539).

As used in this application, a residue (such as a nucleic acid residue or an amino acid residue) in sequence “X” is referred to as corresponding to a position or residue (such as a nucleic acid residue or an amino acid residue) “Z” in a different sequence “Y” when the residue in sequence “X” is at the counterpart position of “Z” in sequence “Y” when sequences X and Y are aligned using amino acid sequence alignment tools known in the art.
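As a non-limiting illustration of this notion of corresponding residues, the following Python sketch takes two sequences that have already been aligned (with `-` marking gaps) and reports which position in sequence X sits in the same alignment column as a given position Z in sequence Y. The function name, the 1-based position convention, and the example peptides are hypothetical and chosen only for illustration.

```python
def corresponding_position(aligned_x, aligned_y, pos_in_y):
    """Return the 1-based position in X aligned to 1-based position pos_in_y in Y.

    Returns None when that position in Y is aligned to a gap in X.
    """
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned sequences must have equal length")
    x_pos = y_pos = 0
    for cx, cy in zip(aligned_x, aligned_y):
        # Advance each ungapped coordinate only when a residue is present
        if cx != "-":
            x_pos += 1
        if cy != "-":
            y_pos += 1
            if y_pos == pos_in_y:
                return x_pos if cx != "-" else None
    raise ValueError("position lies outside sequence Y")


# Hypothetical alignment of X = MKCST and Y = MNACS
aligned_x = "MK-CST"
aligned_y = "MNACS-"
print(corresponding_position(aligned_x, aligned_y, 4))  # 3 (C4 of Y pairs with C3 of X)
print(corresponding_position(aligned_x, aligned_y, 3))  # None (A3 of Y faces a gap in X)
```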

As used in this application, variant sequences may be homologoussequences. As used in this application, homologous sequences aresequences (e.g., nucleic acid or amino acid sequences) that share acertain percent identity (e.g., at least 5%, at least 10%, at least 15%,at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, atleast 45%, at least 50%, at least 55%, at least 60%, at least 65%, atleast 70%, at least 71%, at least 72%, at least 73%, at least 74%, atleast 75%, at least 76%, at least 77%, at least 78%, at least 79%, atleast 80%, at least 81%, at least 82%, at least 83%, at least 84%, atleast 85%, at least 86%, at least 87%, at least 88%, at least 89%, atleast 90%, at least 91%, at least 92%, at least 93%, at least 94%, atleast 95%, at least 96%, at least 97%, at least 98%, at least 99%, or100% percent identity, including all values in between). Homologoussequences include but are not limited to paralogous or orthologoussequences. Paralogous sequences arise from duplication of a gene withina genome of a species, while orthologous sequences diverge after aspeciation event.

In some embodiments, a polypeptide variant (e.g., AAE, PKS, PKC, PT, orTS enzyme variant) comprises a domain that shares a secondary structure(e.g., alpha helix, beta sheet) with a reference polypeptide (e.g., areference AAE, PKS, PKC, PT, or TS enzyme). In some embodiments, apolypeptide variant (e.g., AAE, PKS, PKC, PT, or TS enzyme variant)shares a tertiary structure with a reference polypeptide (e.g., areference AAE, PKS, PKC, PT, or TS enzyme). As a non-limiting example, apolypeptide variant (e.g., AAE, PKS, PKC, PT, or TS enzyme) may have lowprimary sequence identity (e.g., less than 80%, less than 75%, less than70%, less than 65%, less than 60%, less than 55%, less than 50%, lessthan 45%, less than 40%, less than 35%, less than 30%, less than 25%,less than 20%, less than 15%, less than 10%, or less than 5% sequenceidentity) compared to a reference polypeptide, but share one or moresecondary structures (e.g., including but not limited to loops, alphahelices, or beta sheets), or have the same tertiary structure as areference polypeptide. For example, a loop may be located between a betasheet and an alpha helix, between two alpha helices, or between two betasheets. Homology modeling may be used to compare two or more tertiarystructures.

Functional variants of the recombinant AAE, PKS, PKC, PT, or TS enzyme disclosed in this application are encompassed by the present disclosure. For example, functional variants may bind one or more of the same substrates or produce one or more of the same products. Functional variants may be identified using any method known in the art. For example, the algorithm of Karlin and Altschul, Proc. Natl. Acad. Sci. USA 87:2264-68, 1990, described above may be used to identify homologous proteins with known functions.

Putative functional variants may also be identified by searching for polypeptides with functionally annotated domains. Databases including Pfam (Sonnhammer et al., Proteins. 1997 July; 28(3):405-20) may be used to identify polypeptides with a particular domain.

Homology modeling may also be used to identify amino acid residues that are amenable to mutation (e.g., substitution, deletion, and/or insertion) without affecting function. A non-limiting example of such a method may include use of a position-specific scoring matrix (PSSM) and an energy minimization protocol.

Position-specific scoring matrix (PSSM) uses a position weight matrix to identify consensus sequences (e.g., motifs). PSSM can be conducted on nucleic acid or amino acid sequences. Sequences are aligned and the method takes into account the observed frequency of a particular residue (e.g., an amino acid or a nucleotide) at a particular position and the number of sequences analyzed. See, e.g., Stormo et al., Nucleic Acids Res. 1982 May 11; 10(9):2997-3011. The likelihood of observing a particular residue at a given position can be calculated. Without being bound by a particular theory, positions in sequences with high variability may be amenable to mutation (e.g., substitution, deletion, and/or insertion; e.g., PSSM score ≥0) to produce functional homologs.
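As a non-limiting illustration, a log-odds PSSM may be derived from an ungapped multiple sequence alignment as sketched below. The function name `build_pssm`, the uniform background frequency, the pseudocount of 1, and the toy alignment are assumptions made for this example and are not taken from the cited reference; practical tools typically use empirically derived background frequencies and sequence weighting.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, alphabet=AMINO_ACIDS, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    Scores compare the pseudocount-smoothed observed frequency of each
    residue with a uniform background (an assumption of this sketch).
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    if pseudocount <= 0:
        raise ValueError("pseudocount must be positive to avoid log(0)")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")

    background = 1.0 / len(alphabet)
    pssm = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in alignment)
        # Only residues in the alphabet contribute to the column total.
        observed = sum(counts[res] for res in alphabet)
        total = observed + pseudocount * len(alphabet)
        column = {}
        for res in alphabet:
            freq = (counts[res] + pseudocount) / total
            column[res] = math.log2(freq / background)
        pssm.append(column)
    return pssm


if __name__ == "__main__":
    # Hypothetical three-column alignment, for illustration only.
    toy_alignment = ["MKA", "MKG", "MRA", "MKS"]
    matrix = build_pssm(toy_alignment)
    tolerated = [res for res, score in matrix[2].items() if score >= 0]
    print("Residues with PSSM score >= 0 at column 3:", tolerated)
```

In this sketch, a residue observed more often than expected under the background receives a positive score, so a cutoff such as a PSSM score ≥0 selects residues that the alignment suggests are tolerated at that position.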

PSSM may be paired with calculation of a Rosetta energy function, which determines the difference between the wild-type and the single-point mutant. The Rosetta energy function calculates this difference as ΔΔG_calc. With the Rosetta function, the bonding interactions between a mutated residue and the surrounding atoms are used to determine whether an amino acid substitution, deletion, or insertion increases or decreases protein stability. For example, an amino acid substitution, deletion, or insertion that is designated as favorable by the PSSM score (e.g., PSSM score ≥0) can then be analyzed using the Rosetta energy function to determine the potential impact of the amino acid substitution, deletion, or insertion on protein stability. Without being bound by a particular theory, potentially stabilizing amino acid substitutions, deletions, or insertions are desirable for protein engineering (e.g., production of functional homologs). In some embodiments, a potentially stabilizing amino acid substitution, deletion, or insertion has a ΔΔG_calc value of less than −0.1 (e.g., less than −0.2, less than −0.3, less than −0.35, less than −0.4, less than −0.45, less than −0.5, less than −0.55, less than −0.6, less than −0.65, less than −0.7, less than −0.75, less than −0.8, less than −0.85, less than −0.9, less than −0.95, or less than −1.0) Rosetta energy units (R.e.u.). See, e.g., Goldenzweig et al., Mol Cell. 2016 Jul. 21; 63(2):337-346. doi: 10.1016/j.molcel.2016.06.012.
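As a non-limiting illustration, the two-stage selection described above (a PSSM cutoff followed by a ΔΔG_calc cutoff) may be sketched as a simple filter. The class `CandidateMutation`, the function `select_stabilizing`, and all numerical values below are hypothetical; the sketch does not run Rosetta and assumes that ΔΔG_calc values (in R.e.u.) have been computed separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateMutation:
    label: str         # e.g., "A10V" (hypothetical mutation name)
    pssm_score: float  # score of the mutant residue at that position
    ddg_calc: float    # precomputed energy difference, in R.e.u.


def select_stabilizing(candidates, min_pssm=0.0, max_ddg=-0.1):
    """Keep mutations with PSSM score >= min_pssm and ddg_calc < max_ddg."""
    return [
        c for c in candidates
        if c.pssm_score >= min_pssm and c.ddg_calc < max_ddg
    ]


if __name__ == "__main__":
    # Made-up values, for illustration only.
    pool = [
        CandidateMutation("A10V", 1.2, -0.8),
        CandidateMutation("G25S", -0.5, -1.5),  # fails the PSSM cutoff
        CandidateMutation("L40F", 0.0, 0.3),    # predicted destabilizing
        CandidateMutation("K77R", 0.4, -0.1),   # not strictly below -0.1
    ]
    print([c.label for c in select_stabilizing(pool)])
```

A stricter ΔΔG_calc threshold (e.g., −0.45 R.e.u.) may be applied by passing a different `max_ddg` value.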

In some embodiments, an AAE, PKS, PKC, PT, or TS enzyme coding sequence comprises an amino acid substitution, deletion, and/or insertion at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more than 100 positions corresponding to a reference (e.g., AAE, PKS, PKC, PT, or TS enzyme) coding sequence. In some embodiments, the AAE, PKS, PKC, PT, or TS enzyme coding sequence comprises an amino acid substitution, deletion, and/or insertion in 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100 or more codons of the coding sequence relative to a reference (e.g., AAE, PKS, PKC, PT, or TS enzyme) coding sequence. As will be understood by one of ordinary skill in the art, a substitution, insertion, or deletion within a codon may or may not change the amino acid that is encoded by the codon due to degeneracy of the genetic code. In some embodiments, the one or more substitutions, insertions, or deletions in the coding sequence do not alter the amino acid sequence encoded by the coding sequence (e.g., AAE, PKS, PKC, PT, or TS enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme).
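As a non-limiting illustration of the degeneracy of the genetic code, the sketch below compares two coding sequences codon by codon and reports whether each changed codon still encodes the same amino acid. It uses the standard genetic code and handles only nucleotide substitutions (equal-length, in-frame, uppercase DNA sequences); the function names and the short example sequences are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def classify_codon_changes(reference_cds: str, variant_cds: str):
    """List (codon index, reference codon, variant codon, class) per change."""
    if len(reference_cds) != len(variant_cds) or len(reference_cds) % 3:
        raise ValueError("sequences must be in-frame and of equal length")
    changes = []
    for i in range(0, len(reference_cds), 3):
        ref_codon = reference_cds[i:i + 3]
        var_codon = variant_cds[i:i + 3]
        if ref_codon == var_codon:
            continue
        same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[var_codon]
        kind = "synonymous" if same_aa else "nonsynonymous"
        changes.append((i // 3, ref_codon, var_codon, kind))
    return changes


if __name__ == "__main__":
    # Hypothetical three-codon sequences, for illustration only.
    for change in classify_codon_changes("ATGGCTAAA", "ATGGCCAGA"):
        print(change)
```

In this example the change from GCT to GCC leaves alanine unchanged, whereas the change from AAA to AGA replaces lysine with arginine.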

In some embodiments, the one or more substitutions, deletions, and/or insertions in a recombinant AAE, PKS, PKC, PT, or TS enzyme sequence alter the amino acid sequence of the polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme). In some embodiments, the one or more substitutions, insertions, or deletions alter the amino acid sequence of the recombinant polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme) relative to the amino acid sequence of a reference polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme) and alter (enhance or reduce) an activity of the polypeptide relative to the reference polypeptide.

The activity (e.g., specific activity) of any of the recombinant polypeptides described in this application (e.g., AAE, PKS, PKC, PT, or TS enzyme) may be measured using routine methods. As a non-limiting example, a recombinant polypeptide's activity may be determined by measuring its substrate specificity, product(s) produced, the concentration of product(s) produced, or any combination thereof. As used in this application, “specific activity” of a recombinant polypeptide refers to the amount (e.g., concentration) of a particular product produced for a given amount (e.g., concentration) of the recombinant polypeptide per unit time.
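As a non-limiting illustration, the definition of specific activity given above corresponds to dividing the amount of product by the amount of polypeptide and by the elapsed time. The function name, the units, and the numerical values below are hypothetical and serve only to show the arithmetic.

```python
def specific_activity(product_amount: float,
                      enzyme_amount: float,
                      elapsed_time: float) -> float:
    """Product formed per unit of enzyme per unit time.

    Units follow the inputs, e.g., nmol product / (mg enzyme * min).
    """
    if enzyme_amount <= 0 or elapsed_time <= 0:
        raise ValueError("enzyme amount and elapsed time must be positive")
    return product_amount / (enzyme_amount * elapsed_time)


if __name__ == "__main__":
    # Hypothetical assay: 120 nmol product, 2 mg enzyme, 30 min.
    variant = specific_activity(120.0, 2.0, 30.0)
    reference = specific_activity(60.0, 2.0, 30.0)
    print(variant, variant / reference)  # value and fold change
```

Comparing the value for a variant with that for a reference polypeptide measured under the same conditions gives a relative activity.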

The skilled artisan will also realize that insertions, substitutions, or deletions in a recombinant polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme) coding sequence may result in conservative amino acid substitutions to provide functionally equivalent variants of the foregoing polypeptides, e.g., variants that retain the activities of the polypeptides. As used in this application, a “conservative amino acid substitution” refers to an amino acid substitution that does not alter the relative charge or size characteristics or functional activity of the protein in which the amino acid substitution is made.

In some instances, an amino acid is characterized by its R group (see, e.g., Table 3). For example, an amino acid may comprise a nonpolar aliphatic R group, a positively charged R group, a negatively charged R group, a nonpolar aromatic R group, or a polar uncharged R group. Non-limiting examples of an amino acid comprising a nonpolar aliphatic R group include alanine, glycine, valine, leucine, methionine, and isoleucine. Non-limiting examples of an amino acid comprising a positively charged R group include lysine, arginine, and histidine. Non-limiting examples of an amino acid comprising a negatively charged R group include aspartate and glutamate. Non-limiting examples of an amino acid comprising a nonpolar aromatic R group include phenylalanine, tyrosine, and tryptophan. Non-limiting examples of an amino acid comprising a polar uncharged R group include serine, threonine, cysteine, proline, asparagine, and glutamine.

Non-limiting examples of functionally equivalent variants of polypeptides may include conservative amino acid substitutions in the amino acid sequences of proteins disclosed in this application. As used in this application, “conservative substitution” is used interchangeably with “conservative amino acid substitution” and refers to any one of the amino acid substitutions provided in Table 2.

In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more than 20 residues can be changed when preparing variant polypeptides. In some embodiments, amino acids are replaced by conservative amino acid substitutions.

TABLE 2
Conservative Amino Acid Substitutions

Original    R Group Type                  Conservative Amino
Residue                                   Acid Substitutions
Ala         nonpolar aliphatic R group    Cys, Gly, Ser
Arg         positively charged R group    His, Lys
Asn         polar uncharged R group       Asp, Gln, Glu
Asp         negatively charged R group    Asn, Gln, Glu
Cys         polar uncharged R group       Ala, Ser
Gln         polar uncharged R group       Asn, Asp, Glu
Glu         negatively charged R group    Asn, Asp, Gln
Gly         nonpolar aliphatic R group    Ala, Ser
His         positively charged R group    Arg, Tyr, Trp
Ile         nonpolar aliphatic R group    Leu, Met, Val
Leu         nonpolar aliphatic R group    Ile, Met, Val
Lys         positively charged R group    Arg, His
Met         nonpolar aliphatic R group    Ile, Leu, Phe, Val
Pro         polar uncharged R group
Phe         nonpolar aromatic R group     Met, Trp, Tyr
Ser         polar uncharged R group       Ala, Gly, Thr
Thr         polar uncharged R group       Ala, Asn, Ser
Trp         nonpolar aromatic R group     His, Phe, Tyr, Met
Tyr         nonpolar aromatic R group     His, Phe, Trp
Val         nonpolar aliphatic R group    Ile, Leu, Met, Thr
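As a non-limiting illustration, the substitutions listed in Table 2 may be encoded as a lookup so that a proposed substitution can be checked programmatically. The dictionary below transcribes Table 2 using three-letter residue codes; the names `CONSERVATIVE_SUBSTITUTIONS` and `is_conservative` are assumptions made for this example. The table is not symmetric (for example, Val to Thr is listed but Thr to Val is not), so the lookup is directional.

```python
# Transcription of Table 2: original residue -> listed substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": ("Cys", "Gly", "Ser"),
    "Arg": ("His", "Lys"),
    "Asn": ("Asp", "Gln", "Glu"),
    "Asp": ("Asn", "Gln", "Glu"),
    "Cys": ("Ala", "Ser"),
    "Gln": ("Asn", "Asp", "Glu"),
    "Glu": ("Asn", "Asp", "Gln"),
    "Gly": ("Ala", "Ser"),
    "His": ("Arg", "Tyr", "Trp"),
    "Ile": ("Leu", "Met", "Val"),
    "Leu": ("Ile", "Met", "Val"),
    "Lys": ("Arg", "His"),
    "Met": ("Ile", "Leu", "Phe", "Val"),
    "Pro": (),  # Table 2 lists no substitution for proline
    "Phe": ("Met", "Trp", "Tyr"),
    "Ser": ("Ala", "Gly", "Thr"),
    "Thr": ("Ala", "Asn", "Ser"),
    "Trp": ("His", "Phe", "Tyr", "Met"),
    "Tyr": ("His", "Phe", "Trp"),
    "Val": ("Ile", "Leu", "Met", "Thr"),
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists `replacement` as a substitution for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, ())


if __name__ == "__main__":
    print(is_conservative("Val", "Thr"))  # listed in Table 2
    print(is_conservative("Thr", "Val"))  # reverse direction is not listed
```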

Amino acid substitutions in the amino acid sequence of a polypeptide to produce a recombinant polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme) variant having a desired property and/or activity can be made by alteration of the coding sequence of the polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme). Similarly, conservative amino acid substitutions in the amino acid sequence of a polypeptide to produce functionally equivalent variants of the polypeptide typically are made by alteration of the coding sequence of the recombinant polypeptide (e.g., AAE, PKS, PKC, PT, or TS enzyme).

Mutations (e.g., substitutions, insertions, additions, or deletions) can be made in a nucleic acid sequence by a variety of methods known to one of ordinary skill in the art. For example, mutations (e.g., substitutions, insertions, additions, or deletions) can be made by PCR-directed mutation, site-directed mutagenesis according to the method of Kunkel (Kunkel, Proc. Nat. Acad. Sci. U.S.A. 82: 488-492, 1985), by chemical synthesis of a gene encoding a polypeptide, by CRISPR, or by insertions, such as insertion of a tag (e.g., a HIS tag or a GFP tag). Mutations can include, for example, substitutions, insertions, additions, deletions, and translocations, generated by any method known in the art. Methods for producing mutations may be found in references such as Molecular Cloning: A Laboratory Manual, J. Sambrook, et al., eds., Fourth Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2012, or Current Protocols in Molecular Biology, F. M. Ausubel, et al., eds., John Wiley & Sons, Inc., New York, 2010.

In some embodiments, methods for producing variants include circular permutation (Yu and Lutz, Trends Biotechnol. 2011 January; 29(1):18-25). In circular permutation, the linear primary sequence of a polypeptide can be circularized (e.g., by joining the N-terminal and C-terminal ends of the sequence) and the polypeptide can be severed (“broken”) at a different location. Thus, the linear primary sequence of the new polypeptide may have low sequence identity (e.g., less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 15%, less than 10%, or less than 5%, including all values in between) as determined by linear sequence alignment methods (e.g., Clustal Omega or BLAST). Topological analysis of the two proteins, however, may reveal that the tertiary structure of the two polypeptides is similar or dissimilar. Without being bound by a particular theory, a variant polypeptide created through circular permutation of a reference polypeptide and with a similar tertiary structure as the reference polypeptide can share similar functional characteristics (e.g., enzymatic activity, enzyme kinetics, substrate specificity or product specificity). In some instances, circular permutation may alter the secondary structure, tertiary structure or quaternary structure and produce an enzyme with different functional characteristics (e.g., increased or decreased enzymatic activity, different substrate specificity, or different product specificity). See, e.g., Yu and Lutz, Trends Biotechnol. 2011 January; 29(1):18-25.

It should be appreciated that in a protein that has undergone circular permutation, the linear amino acid sequence of the protein would differ from a reference protein that has not undergone circular permutation. However, one of ordinary skill in the art would be able to determine which residues in the protein that has undergone circular permutation correspond to residues in the reference protein that has not undergone circular permutation by, for example, aligning the sequences and detecting conserved motifs, and/or by comparing the structures or predicted structures of the proteins, e.g., by homology modeling.

In some embodiments, an algorithm that determines the percent identity between a sequence of interest and a reference sequence described in this application accounts for the presence of circular permutation between the sequences. The presence of circular permutation may be detected using any method known in the art, including, for example, RASPODOM (Weiner et al., Bioinformatics. 2005 Apr. 1; 21(7):932-7). In some embodiments, the presence of circular permutation is corrected for (e.g., the domains in at least one sequence are rearranged) prior to calculation of the percent identity between a sequence of interest and a sequence described in this application. The claims of this application should be understood to encompass sequences for which percent identity to a reference sequence is calculated after taking into account potential circular permutation of the sequence.
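As a non-limiting illustration of correcting for circular permutation before calculating percent identity, the sketch below tries every rotation of a query sequence and reports the rotation that gives the highest ungapped identity to a reference of the same length. This brute-force approach is a simplification chosen for this example and is not the RASPODOM method; the function names and the short sequences are hypothetical, and real sequences would generally also require gapped alignment.

```python
def rotate(seq: str, offset: int) -> str:
    """Move the first `offset` residues of `seq` to its end."""
    return seq[offset:] + seq[:offset]


def ungapped_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def best_rotation_identity(query: str, reference: str):
    """Return (offset, percent identity) for the best rotation of `query`."""
    if not query or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    best_offset, best_identity = 0, -1.0
    for offset in range(len(query)):
        identity = ungapped_identity(rotate(query, offset), reference)
        if identity > best_identity:
            best_offset, best_identity = offset, identity
    return best_offset, best_identity


if __name__ == "__main__":
    # Hypothetical sequences: the query is the reference "broken" after
    # residue 4 and rejoined end to start.
    reference = "MKTAYIAKQR"
    permuted = reference[4:] + reference[:4]
    print(ungapped_identity(permuted, reference))       # before correction
    print(best_rotation_identity(permuted, reference))  # after correction
```

In this toy case the uncorrected linear comparison reports low identity, whereas rearranging the permuted sequence recovers full identity to the reference.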

Expression of Nucleic Acids in Host Cells

Aspects of the present disclosure relate to recombinant enzymes, functional modifications and variants thereof, as well as their uses. For example, the methods described in this application may be used to produce cannabinoids and/or cannabinoid precursors. The methods may comprise using a host cell comprising an enzyme disclosed in this application, cell lysate, isolated enzymes, or any combination thereof. Methods comprising recombinant expression of genes encoding an enzyme disclosed in this application in a host cell are encompassed by the present disclosure. In vitro methods comprising reacting one or more cannabinoid precursors or cannabinoids in a reaction mixture with an enzyme disclosed in this application are also encompassed by the present disclosure. In some embodiments, the enzyme is a TS.

A nucleic acid encoding any of the recombinant polypeptides (e.g., AAE, PKS, PKC, PT, or TS enzyme) described in this application may be incorporated into any appropriate vector through any method known in the art. For example, the vector may be an expression vector, including but not limited to a viral vector (e.g., a lentiviral, retroviral, adenoviral, or adeno-associated viral vector), any vector suitable for transient expression, any vector suitable for constitutive expression, or any vector suitable for inducible expression (e.g., a galactose-inducible or doxycycline-inducible vector).

A vector encoding any of the recombinant polypeptides (e.g., AAE, PKS, PKC, PT, or TS enzyme) described in this application may be introduced into a suitable host cell using any method known in the art. Non-limiting examples of yeast transformation protocols are described in Gietz et al., Yeast transformation by the LiAc/SS Carrier DNA/PEG method. Methods Mol Biol. 2006; 313:107-20, which is hereby incorporated by reference in its entirety. Host cells may be cultured under any suitable conditions, as would be understood by one of ordinary skill in the art. For example, any media, temperature, and incubation conditions known in the art may be used. For host cells carrying an inducible vector, cells may be cultured with an appropriate inducing agent to promote expression.

In some embodiments, a vector replicates autonomously in the cell. In some embodiments, a vector integrates into a chromosome within a cell. A vector can contain one or more endonuclease restriction sites that are cut by a restriction endonuclease to insert and ligate a nucleic acid containing a gene described in this application to produce a recombinant vector that is able to replicate in a cell. Vectors are typically composed of DNA, although RNA vectors are also available. Cloning vectors include, but are not limited to: plasmids, fosmids, phagemids, virus genomes and artificial chromosomes. As used in this application, the terms “expression vector” or “expression construct” refer to a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular nucleic acid in a host cell (e.g., microbe), such as a yeast cell. In some embodiments, the nucleic acid sequence of a gene described in this application is inserted into a cloning vector so that it is operably joined to regulatory sequences and, in some embodiments, expressed as an RNA transcript. In some embodiments, the vector contains one or more markers, such as a selectable marker as described in this application, to identify cells transformed or transfected with the recombinant vector. In some embodiments, a host cell has already been transformed with one or more vectors. In some embodiments, a host cell that has been transformed with one or more vectors is subsequently transformed with one or more vectors. In some embodiments, a host cell is transformed simultaneously with more than one vector. In some embodiments, the nucleic acid sequence of a gene described in this application is recoded. Recoding may increase production of the gene product by at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or 100% (including all values in between) relative to a reference sequence that is not recoded.

In some embodiments, the nucleic acid encoding any of the proteins described in this application is under the control of regulatory sequences (e.g., enhancer sequences). In some embodiments, a nucleic acid is expressed under the control of a promoter. The promoter can be a native promoter, e.g., the promoter of the gene in its endogenous context, which provides normal regulation of expression of the gene. Alternatively, a promoter can be a promoter that is different from the native promoter of the gene, e.g., the promoter is different from the promoter of the gene in its endogenous context.

In some embodiments, the promoter is a eukaryotic promoter. Non-limiting examples of eukaryotic promoters include TDH3, PGK1, PKC1, PDC1, TEF1, TEF2, RPL18B, SSA1, TDH2, PYK1, TPI1, GAL1, GAL10, GAL7, GAL3, GAL2, MET3, MET25, HXT3, HXT7, ACT1, ADH1, ADH2, CUP1-1, ENO2, and SOD1, as would be known to one of ordinary skill in the art (see, e.g., Addgene website: blog.addgene.org/plasmids-101-the-promoter-region). In some embodiments, the promoter is a prokaryotic promoter (e.g., bacteriophage or bacterial promoter). Non-limiting examples of bacteriophage promoters include Plslcon, T3, T7, SP6, and PL. Non-limiting examples of bacterial promoters include Pbad, PmgrB, Ptrc2, Plac/ara, Ptac, and Pm.

In some embodiments, the promoter is an inducible promoter. As used in this application, an “inducible promoter” is a promoter controlled by the presence or absence of a molecule. This may be used, for example, to controllably induce the expression of an enzyme. In some embodiments, an inducible promoter linked to a PT and/or a TS may be used to regulate expression of the enzyme(s), for example to reduce cannabinoid production in certain scenarios (e.g., during transport of the genetically modified organism to satisfy regulatory restrictions in certain jurisdictions, or between jurisdictions where cannabinoids may not be shipped). In some embodiments, an inducible promoter linked to a CBGAS and/or a TS may be used to regulate expression of the enzyme(s), for example to reduce cannabinoid production in certain scenarios (e.g., during transport of the genetically modified organism to satisfy regulatory restrictions in certain jurisdictions, or between jurisdictions where cannabinoids may not be shipped). Non-limiting examples of inducible promoters include chemically regulated promoters and physically regulated promoters. For chemically regulated promoters, the transcriptional activity can be regulated by one or more compounds, such as alcohol, tetracycline, galactose, a steroid, a metal, an amino acid, or other compounds. For physically regulated promoters, transcriptional activity can be regulated by a phenomenon such as light or temperature. Non-limiting examples of tetracycline-regulated promoters include anhydrotetracycline (aTc)-responsive promoters and other tetracycline-responsive promoter systems (e.g., a tetracycline repressor protein (tetR), a tetracycline operator sequence (tetO) and a tetracycline transactivator fusion protein (tTA)). Non-limiting examples of steroid-regulated promoters include promoters based on the rat glucocorticoid receptor, human estrogen receptor, moth ecdysone receptors, and promoters from the steroid/retinoid/thyroid receptor superfamily. Non-limiting examples of metal-regulated promoters include promoters derived from metallothionein (proteins that bind and sequester metal ions) genes. Non-limiting examples of pathogenesis-regulated promoters include promoters induced by salicylic acid, ethylene or benzothiadiazole (BTH). Non-limiting examples of temperature/heat-inducible promoters include heat shock promoters. Non-limiting examples of light-regulated promoters include light responsive promoters from plant cells. In certain embodiments, the inducible promoter is a galactose-inducible promoter. In some embodiments, the inducible promoter is induced by one or more physiological conditions (e.g., pH, temperature, radiation, osmotic pressure, saline gradients, cell surface binding, or concentration of one or more extrinsic or intrinsic inducing agents). Non-limiting examples of an extrinsic inducer or inducing agent include amino acids and amino acid analogs, saccharides and polysaccharides, nucleic acids, protein transcriptional activators and repressors, cytokines, toxins, petroleum-based compounds, metal containing compounds, salts, ions, enzyme substrate analogs, hormones or any combination thereof.

In some embodiments, the promoter is a constitutive promoter. As used in this application, a “constitutive promoter” refers to an unregulated promoter that allows continuous transcription of a gene. Non-limiting examples of a constitutive promoter include TDH3, PGK1, PKC1, PDC1, TEF1, TEF2, RPL18B, SSA1, TDH2, PYK1, TPI1, HXT3, HXT7, ACT1, ADH1, ADH2, ENO2, and SOD1.

Other inducible promoters or constitutive promoters, including synthetic promoters, that may be known to one of ordinary skill in the art are also contemplated.

The precise nature of the regulatory sequences needed for gene expression may vary between species or cell types, but such sequences generally include, as necessary, 5′ non-transcribed and 5′ non-translated sequences involved with the initiation of transcription and translation respectively, such as a TATA box, capping sequence, CAAT sequence, and the like. In particular, such 5′ non-transcribed regulatory sequences will include a promoter region which includes a promoter sequence for transcriptional control of the operably joined gene. Regulatory sequences may also include enhancer sequences or upstream activator sequences. The vectors disclosed may include 5′ leader or signal sequences. The regulatory sequence may also include a terminator sequence. In some embodiments, a terminator sequence marks the end of a gene in DNA during transcription. The choice and design of one or more appropriate vectors suitable for inducing expression of one or more genes described in this application in a heterologous organism is within the ability and discretion of one of ordinary skill in the art.

Expression vectors containing the necessary elements for expression are commercially available and known to one of ordinary skill in the art (see, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press, 2012).

Host Cells

The disclosed cannabinoid biosynthetic methods and host cells are exemplified with S. cerevisiae, but are also applicable to other host cells, as would be understood by one of ordinary skill in the art.

Suitable host cells include, but are not limited to: yeast cells, bacterial cells, algal cells, plant cells, fungal cells, insect cells, and animal cells, including mammalian cells. In one illustrative embodiment, suitable host cells include E. coli (e.g., Shuffle™ competent E. coli available from New England BioLabs in Ipswich, Mass.).

Other suitable host cells of the present disclosure include microorganisms of the genus Corynebacterium. In some embodiments, preferred Corynebacterium strains/species include: C. efficiens, with the deposited type strain being DSM44549, C. glutamicum, with the deposited type strain being ATCC13032, and C. ammoniagenes, with the deposited type strain being ATCC6871. In some embodiments, the preferred host cell of the present disclosure is C. glutamicum.

Suitable host cells of the genus Corynebacterium, in particular of the species Corynebacterium glutamicum, are in particular the known wild-type strains: Corynebacterium glutamicum ATCC13032, Corynebacterium acetoglutamicum ATCC15806, Corynebacterium acetoacidophilum ATCC13870, Corynebacterium melassecola ATCC17965, Corynebacterium thermoaminogenes FERM BP-1539, Brevibacterium flavum ATCC14067, Brevibacterium lactofermentum ATCC13869, and Brevibacterium divaricatum ATCC14020; and L-amino acid-producing mutants, or strains, prepared therefrom, such as, for example, the L-lysine-producing strains: Corynebacterium glutamicum FERM-P 1709, Brevibacterium flavum FERM-P 1708, Brevibacterium lactofermentum FERM-P 1712, Corynebacterium glutamicum FERM-P 6463, Corynebacterium glutamicum FERM-P 6464, Corynebacterium glutamicum DM58-1, Corynebacterium glutamicum DG52-5, Corynebacterium glutamicum DSM5714, and Corynebacterium glutamicum DSM12866.

Suitable yeast host cells include, but are not limited to: Candida, Hansenula, Saccharomyces, Schizosaccharomyces, Pichia, Kluyveromyces, and Yarrowia. In some embodiments, the yeast cell is Hansenula polymorpha, Saccharomyces cerevisiae, Saccharomyces carlsbergensis, Saccharomyces diastaticus, Saccharomyces norbensis, Saccharomyces kluyveri, Schizosaccharomyces pombe, Komagataella phaffii (formerly known as Pichia pastoris), Pichia finlandica, Pichia trehalophila, Pichia kodamae, Pichia membranaefaciens, Pichia opuntiae, Pichia thermotolerans, Pichia salictaria, Pichia quercuum, Pichia pijperi, Pichia stipitis, Pichia methanolica, Pichia angusta, Kluyveromyces lactis, Candida albicans, or Yarrowia lipolytica.

In some embodiments, the yeast strain is an industrial polyploid yeast strain. Other non-limiting examples of fungal cells include cells obtained from Aspergillus spp., Penicillium spp., Fusarium spp., Rhizopus spp., Acremonium spp., Neurospora spp., Sordaria spp., Magnaporthe spp., Allomyces spp., Ustilago spp., Botrytis spp., and Trichoderma spp.

In certain embodiments, the host cell is an algal cell, such as Chlamydomonas (e.g., C. reinhardtii) or Phormidium (e.g., P. sp. ATCC29409).

In other embodiments, the host cell is a prokaryotic cell. Suitable prokaryotic cells include gram positive, gram negative, and gram-variable bacterial cells. The host cell may be a species of, but not limited to: Agrobacterium, Alicyclobacillus, Anabaena, Anacystis, Acinetobacter, Acidothermus, Arthrobacter, Azobacter, Bacillus, Bifidobacterium, Brevibacterium, Butyrivibrio, Buchnera, Campestris, Campylobacter, Clostridium, Corynebacterium, Chromatium, Coprococcus, Escherichia, Enterococcus, Enterobacter, Erwinia, Fusobacterium, Faecalibacterium, Francisella, Flavobacterium, Geobacillus, Haemophilus, Helicobacter, Klebsiella, Lactobacillus, Lactococcus, Ilyobacter, Micrococcus, Microbacterium, Mesorhizobium, Methylobacterium, Mycobacterium, Neisseria, Pantoea, Pseudomonas, Prochlorococcus, Rhodobacter, Rhodopseudomonas, Roseburia, Rhodospirillum, Rhodococcus, Scenedesmus, Streptomyces, Streptococcus, Synechococcus, Saccharomonospora, Saccharopolyspora, Staphylococcus, Serratia, Salmonella, Shigella, Thermoanaerobacterium, Tropheryma, Tularensis, Temecula, Thermosynechococcus, Thermococcus, Ureaplasma, Xanthomonas, Xylella, Yersinia, and Zymomonas.

In some embodiments, the bacterial host strain is an industrial strain. Numerous bacterial industrial strains are known and suitable for the methods and compositions described in this application.

In some embodiments, the bacterial host cell is of the Agrobacterium species (e.g., A. radiobacter, A. rhizogenes, A. rubi), the Arthrobacter species (e.g., A. aurescens, A. citreus, A. globiformis, A. hydrocarboglutamicus, A. mysorens, A. nicotianae, A. paraffineus, A. protophormiae, A. roseoparaffinus, A. sulfureus, A. ureafaciens), or the Bacillus species (e.g., B. thuringiensis, B. anthracis, B. megaterium, B. subtilis, B. lentus, B. circulans, B. pumilus, B. lautus, B. coagulans, B. brevis, B. firmus, B. alkalophilus, B. licheniformis, B. clausii, B. stearothermophilus, B. halodurans, and B. amyloliquefaciens). In particular embodiments, the host cell will be an industrial Bacillus strain including but not limited to B. subtilis, B. pumilus, B. licheniformis, B. megaterium, B. clausii, B. stearothermophilus and B. amyloliquefaciens. In some embodiments, the host cell will be an industrial Clostridium species (e.g., C. acetobutylicum, C. tetani E88, C. lituseburense, C. saccharobutylicum, C. perfringens, C. beijerinckii). In some embodiments, the host cell will be an industrial Corynebacterium species (e.g., C. glutamicum, C. acetoacidophilum). In some embodiments, the host cell will be an industrial Escherichia species (e.g., E. coli). In some embodiments, the host cell will be an industrial Erwinia species (e.g., E. uredovora, E. carotovora, E. ananas, E. herbicola, E. punctata, E. terreus). In some embodiments, the host cell will be an industrial Pantoea species (e.g., P. citrea, P. agglomerans). In some embodiments, the host cell will be an industrial Pseudomonas species (e.g., P. putida, P. aeruginosa, P. mevalonii). In some embodiments, the host cell will be an industrial Streptococcus species (e.g., S. equisimilis, S. pyogenes, S. uberis). In some embodiments, the host cell will be an industrial Streptomyces species (e.g., S. ambofaciens, S. achromogenes, S. avermitilis, S. coelicolor, S. aureofaciens, S. aureus, S. fungicidicus, S. griseus, S. lividans). In some embodiments, the host cell will be an industrial Zymomonas species (e.g., Z. mobilis, Z. lipolytica), and the like.

The present disclosure is also suitable for use with a variety of animal cell types, including mammalian cells, for example, human (including 293, HeLa, WI38, PER.C6 and Bowes melanoma cells), mouse (including 3T3, NS0, NS1, Sp2/0), hamster (CHO, BHK), and monkey (COS, FRhL, Vero) cells; insect cells, for example, fall armyworm (including Sf9 and Sf21), silkmoth (including BmN), cabbage looper (including BTI-Tn-5B1-4) and common fruit fly (including Schneider 2) cells; and hybridoma cell lines.

In various embodiments, strains that may be used in the practice of the disclosure include both prokaryotic and eukaryotic strains and are readily accessible to the public from a number of culture collections such as American Type Culture Collection (ATCC), Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM), Centraalbureau Voor Schimmelcultures (CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional Research Center (NRRL). The present disclosure is also suitable for use with a variety of plant cell types. In some embodiments, the plant is of the Cannabis genus in the family Cannabaceae. In certain embodiments, the plant is of the species Cannabis sativa, Cannabis indica, or Cannabis ruderalis. In other embodiments, the plant is of the genus Nicotiana in the family Solanaceae. In certain embodiments, the plant is of the species Nicotiana rustica.

The term “cell,” as used in this application, may refer to a single cell or a population of cells, such as a population of cells belonging to the same cell line or strain. Use of the singular term “cell” should not be construed to refer explicitly to a single cell rather than a population of cells. The host cell may comprise genetic modifications relative to a wild-type counterpart. Reduction of gene expression and/or gene inactivation in a host cell may be achieved through any suitable method, including but not limited to, deletion of the gene, introduction of a point mutation into the gene, selective editing of the gene and/or truncation of the gene. For example, polymerase chain reaction (PCR)-based methods may be used (see, e.g., Gardner et al., Methods Mol Biol. 2014; 1205:45-78). As a non-limiting example, genes may be deleted through gene replacement (e.g., with a marker, including a selection marker). A gene may also be truncated through the use of a transposon system (see, e.g., Poussu et al., Nucleic Acids Res. 2005; 33(12): e104). A gene may also be edited through the use of gene editing technologies known in the art, such as CRISPR-based technologies.

Culturing of Host Cells

Any of the cells disclosed in this application can be cultured in media of any type (rich or minimal) and any composition prior to, during, and/or after contact and/or integration of a nucleic acid. The conditions of the culture or culturing process can be optimized through routine experimentation as would be understood by one of ordinary skill in the art. In some embodiments, the selected media is supplemented with various components. In some embodiments, the concentration and amount of a supplemental component is optimized. In some embodiments, other aspects of the media and growth conditions (e.g., pH, temperature, etc.) are optimized through routine experimentation. In some embodiments, the frequency with which the media is supplemented with one or more supplemental components, and the amount of time that the cell is cultured, are optimized.

Culturing of the cells described in this application can be performed in culture vessels known and used in the art. In some embodiments, an aerated reaction vessel (e.g., a stirred tank reactor) is used to culture the cells. In some embodiments, a bioreactor or fermentor is used to culture the cell. Thus, in some embodiments, the cells are used in fermentation. As used in this application, the terms “bioreactor” and “fermentor” are interchangeably used and refer to an enclosure, or partial enclosure, in which a biological, biochemical and/or chemical reaction takes place that involves a living organism or part of a living organism. A “large-scale bioreactor” or “industrial-scale bioreactor” is a bioreactor that is used to generate a product on a commercial or quasi-commercial scale. Large scale bioreactors typically have volumes in the range of liters, hundreds of liters, thousands of liters, or more.

Non-limiting examples of bioreactors include: stirred tank fermentors, bioreactors agitated by rotating mixing devices, chemostats, bioreactors agitated by shaking devices, airlift fermentors, packed-bed reactors, fixed-bed reactors, fluidized bed bioreactors, bioreactors employing wave induced agitation, centrifugal bioreactors, roller bottles, hollow fiber bioreactors, roller apparatuses (for example benchtop, cart-mounted, and/or automated varieties), vertically-stacked plates, spinner flasks, stirring or rocking flasks, shaken multi-well plates, MD bottles, T-flasks, Roux bottles, multiple-surface tissue culture propagators, modified fermentors, and coated beads (e.g., beads coated with serum proteins, nitrocellulose, or carboxymethyl cellulose to prevent cell attachment).

In some embodiments, the bioreactor includes a cell culture system where the cell (e.g., yeast cell) is in contact with moving liquids and/or gas bubbles. In some embodiments, the cell or cell culture is grown in suspension. In other embodiments, the cell or cell culture is attached to a solid phase carrier. Non-limiting examples of a carrier system include microcarriers (e.g., polymer spheres, microbeads, and microdisks that can be porous or non-porous), cross-linked beads (e.g., dextran) charged with specific chemical groups (e.g., tertiary amine groups), 2D microcarriers including cells trapped in nonporous polymer fibers, 3D carriers (e.g., carrier fibers, hollow fibers, multicartridge reactors, and semi-permeable membranes that can comprise porous fibers), microcarriers having reduced ion exchange capacity, encapsulation cells, capillaries, and aggregates. In some embodiments, carriers are fabricated from materials such as dextran, gelatin, glass, or cellulose.

In some embodiments, industrial-scale processes are operated in continuous, semi-continuous or non-continuous modes. Non-limiting examples of operation modes are batch, fed batch, extended batch, repetitive batch, draw/fill, rotating-wall, spinning flask, and/or perfusion mode of operation. In some embodiments, a bioreactor allows continuous or semi-continuous replenishment of the substrate stock, for example a carbohydrate source, and/or continuous or semi-continuous separation of the product from the bioreactor.

In some embodiments, the bioreactor or fermentor includes a sensor and/or a control system to measure and/or adjust reaction parameters. Non-limiting examples of reaction parameters include biological parameters (e.g., growth rate, cell size, cell number, cell density, cell type, or cell state, etc.), chemical parameters (e.g., pH, redox-potential, concentration of reaction substrate and/or product, concentration of dissolved gases, such as oxygen concentration and CO₂ concentration, nutrient concentrations, metabolite concentrations, concentration of an oligopeptide, concentration of an amino acid, concentration of a vitamin, concentration of a hormone, concentration of an additive, serum concentration, ionic strength, concentration of an ion, relative humidity, molarity, osmolarity, concentration of other chemicals, for example buffering agents, adjuvants, or reaction by-products), physical/mechanical parameters (e.g., density, conductivity, degree of agitation, pressure, and flow rate, shear stress, shear rate, viscosity, color, turbidity, light absorption, mixing rate, conversion rate, as well as thermodynamic parameters, such as temperature, light intensity/quality, etc.). Sensors to measure the parameters described in this application are well known to one of ordinary skill in the relevant mechanical and electronic arts. Control systems to adjust the parameters in a bioreactor based on the inputs from a sensor described in this application are well known to one of ordinary skill in the art in bioreactor engineering.

In some embodiments, the method involves batch fermentation (e.g., shake flask fermentation). General considerations for batch fermentation (e.g., shake flask fermentation) include the level of oxygen and glucose. For example, batch fermentation (e.g., shake flask fermentation) may be oxygen and glucose limited, so in some embodiments, the capability of a strain to perform in a well-designed fed-batch fermentation is underestimated. Also, the final product (e.g., cannabinoid or cannabinoid precursor) may display some differences from the substrate in terms of solubility, toxicity, cellular accumulation and secretion and in some embodiments can have different fermentation kinetics.

In some embodiments, the cells of the present disclosure are adapted to produce cannabinoids or cannabinoid precursors in vivo. In some embodiments, the cells are adapted to secrete one or more enzymes for cannabinoid synthesis (e.g., AAE, PKS, PKC, PT, or TS). In some embodiments, the cells of the present disclosure are lysed, and the remaining lysates are recovered for subsequent use. In such embodiments, the secreted or lysed enzyme can catalyze reactions for the production of a cannabinoid or precursor by bioconversion in an in vitro or ex vivo process. In some embodiments, any and all conversions described in this application can be conducted chemically or enzymatically, in vitro or in vivo.

Purification and Further Processing

In some embodiments, any of the methods described in this application may include isolation and/or purification of the cannabinoids and/or cannabinoid precursors produced (e.g., produced in a bioreactor). For example, the isolation and/or purification can involve one or more of cell lysis, centrifugation, extraction, column chromatography, distillation, crystallization, and lyophilization.

The methods described in this application encompass production of any cannabinoid or cannabinoid precursor known in the art. Cannabinoids or cannabinoid precursors produced by any of the recombinant cells disclosed in this application or any of the in vitro methods described in this application may be identified and extracted using any method known in the art. Mass spectrometry (e.g., LC-MS, GC-MS) is a non-limiting example of a method for identification and may be used to extract a compound of interest.

In some embodiments, any of the methods described in this application further comprise decarboxylation of a cannabinoid or cannabinoid precursor. As a non-limiting example, the acid form of a cannabinoid or cannabinoid precursor may be heated (e.g., at least 90° C.) to decarboxylate the cannabinoid or cannabinoid precursor. See, e.g., U.S. Pat. Nos. 10,159,908, 10,143,706, 9,908,832 and 7,344,736. See also, e.g., Wang et al., Cannabis Cannabinoid Res. 2016; 1(1): 262-271.
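
As a rough illustration of the mass balance implied by decarboxylation, the following Python sketch estimates the theoretical mass of neutral cannabinoid obtained when an acid form loses one CO2 per molecule. The function name and the molar masses (standard values for CO2 and for cannabidiolic acid, C22H30O4) are assumptions introduced for illustration and do not appear in the disclosure; actual yields depend on temperature, time, and side reactions.

```python
# Illustrative sketch only: theoretical mass balance for decarboxylation.
# Molar masses below are standard reference values, not taken from the disclosure.
M_CO2 = 44.01             # g/mol, carbon dioxide
CBDA_MOLAR_MASS = 358.48  # g/mol, cannabidiolic acid (C22H30O4), used as an example


def decarboxylated_mass(acid_mass_g: float, acid_molar_mass: float) -> float:
    """Return the theoretical mass (g) of neutral compound after complete
    loss of one CO2 per molecule of the acid form."""
    if acid_mass_g < 0 or acid_molar_mass <= M_CO2:
        raise ValueError("mass must be non-negative and molar mass must exceed that of CO2")
    neutral_molar_mass = acid_molar_mass - M_CO2
    return acid_mass_g * neutral_molar_mass / acid_molar_mass


if __name__ == "__main__":
    # 100 g of the acid form gives roughly 87.7 g of the neutral form at full conversion.
    print(round(decarboxylated_mass(100.0, CBDA_MOLAR_MASS), 2))
```

The factor of about 0.877 applies only to complete conversion; a measured yield below this value would indicate incomplete decarboxylation or losses.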

Compositions, Kits, and Administration

The present disclosure provides compositions, including pharmaceutical compositions, comprising a cannabinoid or a cannabinoid precursor, or pharmaceutically acceptable salt thereof, produced by any of the methods described in this application, and optionally a pharmaceutically acceptable excipient.

In certain embodiments, a cannabinoid or cannabinoid precursor described in this application is provided in an effective amount in a composition, such as a pharmaceutical composition. In certain embodiments, the effective amount is a therapeutically effective amount. In certain embodiments, the effective amount is a prophylactically effective amount.

Compositions, such as pharmaceutical compositions, described in this application can be prepared by any method known in the art. In general, such preparatory methods include bringing a compound described in this application (i.e., the “active ingredient”) into association with a carrier or excipient, and/or one or more other accessory ingredients, and then, if necessary and/or desirable, shaping, and/or packaging the product into a desired single- or multi-dose unit.

Pharmaceutical compositions can be prepared, packaged, and/or sold in bulk, as a single unit dose, and/or as a plurality of single unit doses. A “unit dose” is a discrete amount of the pharmaceutical composition comprising a predetermined amount of the active ingredient. The amount of the active ingredient is generally equal to the dosage of the active ingredient which would be administered to a subject and/or a convenient fraction of such a dosage, such as one-half or one-third of such a dosage.

Relative amounts of the active ingredient, the pharmaceutically acceptable excipient, and/or any additional ingredients in a pharmaceutical composition described in this application will vary, depending upon the identity, size, and/or condition of the subject treated and further depending upon the route by which the composition is to be administered. The composition may comprise between 0.1% and 100% (w/w) active ingredient.
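
The weight-per-weight figure above can be made concrete with a short calculation. The following Python sketch computes the percentage (w/w) of active ingredient in a composition and checks it against the 0.1% to 100% range stated above; the function names and the example masses are hypothetical and are used only to illustrate the arithmetic.

```python
# Illustrative sketch only: percent (w/w) of active ingredient in a composition.
# Function names and example values are hypothetical.

def percent_w_w(active_mass_g: float, total_mass_g: float) -> float:
    """Return the active ingredient content as a percentage of total mass."""
    if total_mass_g <= 0 or not 0 <= active_mass_g <= total_mass_g:
        raise ValueError("require 0 <= active mass <= total mass and total mass > 0")
    return 100.0 * active_mass_g / total_mass_g


def in_stated_range(pct: float, low: float = 0.1, high: float = 100.0) -> bool:
    """Check a percentage against the 0.1% to 100% (w/w) range given in the text."""
    return low <= pct <= high


if __name__ == "__main__":
    pct = percent_w_w(25.0, 500.0)  # 25 g active ingredient in 500 g of composition
    print(pct, in_stated_range(pct))
```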

Pharmaceutically acceptable excipients used in the manufacture of pharmaceutical compositions include inert diluents, dispersing and/or granulating agents, surface active agents and/or emulsifiers, disintegrating agents, binding agents, preservatives, buffering agents, lubricating agents, and/or oils. Excipients such as cocoa butter and suppository waxes, coloring agents, coating agents, sweetening, flavoring, and perfuming agents may also be present in the composition. Exemplary excipients include diluents, dispersing and/or granulating agents, surface active agents and/or emulsifiers, disintegrating agents, binding agents, preservatives, buffering agents, lubricating agents, and/or oils (e.g., synthetic oils, semi-synthetic oils) as disclosed in this application.

Exemplary diluents include calcium carbonate, sodium carbonate, calcium phosphate, dicalcium phosphate, calcium sulfate, calcium hydrogen phosphate, sodium phosphate, lactose, sucrose, cellulose, microcrystalline cellulose, kaolin, mannitol, sorbitol, inositol, sodium chloride, dry starch, cornstarch, powdered sugar, and mixtures thereof.

Exemplary granulating and/or dispersing agents include potato starch, corn starch, tapioca starch, sodium starch glycolate, clays, alginic acid, guar gum, citrus pulp, agar, bentonite, cellulose, and wood products, natural sponge, cation-exchange resins, calcium carbonate, silicates, sodium carbonate, cross-linked poly(vinyl-pyrrolidone) (crospovidone), sodium carboxymethyl starch (sodium starch glycolate), carboxymethyl cellulose, cross-linked sodium carboxymethyl cellulose (croscarmellose), methylcellulose, pregelatinized starch (starch 1500), microcrystalline starch, water insoluble starch, calcium carboxymethyl cellulose, magnesium aluminum silicate (Veegum), sodium lauryl sulfate, quaternary ammonium compounds, and mixtures thereof.

Exemplary surface active agents and/or emulsifiers include natural emulsifiers (e.g., acacia, agar, alginic acid, sodium alginate, tragacanth, chondrus, cholesterol, xanthan, pectin, gelatin, egg yolk, casein, wool fat, wax, and lecithin), colloidal clays (e.g., bentonite (aluminum silicate) and Veegum (magnesium aluminum silicate)), long chain amino acid derivatives, high molecular weight alcohols (e.g., stearyl alcohol, cetyl alcohol, oleyl alcohol, triacetin monostearate, ethylene glycol distearate, glyceryl monostearate, and propylene glycol monostearate, polyvinyl alcohol), carbomers (e.g., carboxy polymethylene, polyacrylic acid, acrylic acid polymer, and carboxyvinyl polymer), carrageenan, cellulosic derivatives (e.g., carboxymethylcellulose sodium, powdered cellulose, hydroxymethyl cellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, methylcellulose), sorbitan fatty acid esters (e.g., polyoxyethylene sorbitan monolaurate (Tween® 20), polyoxyethylene sorbitan (Tween® 60), polyoxyethylene sorbitan monooleate (Tween® 80), sorbitan monopalmitate (Span® 40), sorbitan monostearate (Span® 60), sorbitan tristearate (Span® 65), glyceryl monooleate, sorbitan monooleate (Span® 80)), polyoxyethylene esters (e.g., polyoxyethylene monostearate (Myrj® 45), polyoxyethylene hydrogenated castor oil, polyethoxylated castor oil, polyoxymethylene stearate, and Solutol®), sucrose fatty acid esters, polyethylene glycol fatty acid esters (e.g., Cremophor®), polyoxyethylene ethers (e.g., polyoxyethylene lauryl ether (Brij® 30)), poly(vinyl-pyrrolidone), diethylene glycol monolaurate, triethanolamine oleate, sodium oleate, potassium oleate, ethyl oleate, oleic acid, ethyl laurate, sodium lauryl sulfate, Pluronic® F-68, poloxamer P-188, cetrimonium bromide, cetylpyridinium chloride, benzalkonium chloride, docusate sodium, and/or mixtures thereof.

Exemplary binding agents include starch (e.g., cornstarch and starch paste), gelatin, sugars (e.g., sucrose, glucose, dextrose, dextrin, molasses, lactose, lactitol, mannitol, etc.), natural and synthetic gums (e.g., acacia, sodium alginate, extract of Irish moss, panwar gum, ghatti gum, mucilage of isapol husks, carboxymethylcellulose, methylcellulose, ethylcellulose, hydroxyethylcellulose, hydroxypropyl cellulose, hydroxypropyl methylcellulose, microcrystalline cellulose, cellulose acetate, poly(vinyl-pyrrolidone), magnesium aluminum silicate (Veegum®), and larch arabogalactan), alginates, polyethylene oxide, polyethylene glycol, inorganic calcium salts, silicic acid, polymethacrylates, waxes, water, alcohol, and/or mixtures thereof.

Exemplary preservatives include antioxidants, chelating agents, antimicrobial preservatives, antifungal preservatives, antiprotozoan preservatives, alcohol preservatives, acidic preservatives, and other preservatives. In certain embodiments, the preservative is an antioxidant. In other embodiments, the preservative is a chelating agent.

Exemplary antioxidants include alpha tocopherol, ascorbic acid, ascorbyl palmitate, butylated hydroxyanisole, butylated hydroxytoluene, monothioglycerol, potassium metabisulfite, propionic acid, propyl gallate, sodium ascorbate, sodium bisulfite, sodium metabisulfite, and sodium sulfite.

Exemplary chelating agents include ethylenediaminetetraacetic acid (EDTA) and salts and hydrates thereof (e.g., sodium edetate, disodium edetate, trisodium edetate, calcium disodium edetate, dipotassium edetate, and the like), citric acid and salts and hydrates thereof (e.g., citric acid monohydrate), fumaric acid and salts and hydrates thereof, malic acid and salts and hydrates thereof, phosphoric acid and salts and hydrates thereof, and tartaric acid and salts and hydrates thereof. Exemplary antimicrobial preservatives include benzalkonium chloride, benzethonium chloride, benzyl alcohol, bronopol, cetrimide, cetylpyridinium chloride, chlorhexidine, chlorobutanol, chlorocresol, chloroxylenol, cresol, ethyl alcohol, glycerin, hexetidine, imidurea, phenol, phenoxyethanol, phenylethyl alcohol, phenylmercuric nitrate, propylene glycol, and thimerosal.

Exemplary antifungal preservatives include butyl paraben, methyl paraben, ethyl paraben, propyl paraben, benzoic acid, hydroxybenzoic acid, potassium benzoate, potassium sorbate, sodium benzoate, sodium propionate, and sorbic acid.

Exemplary alcohol preservatives include ethanol, polyethylene glycol, phenol, phenolic compounds, bisphenol, chlorobutanol, hydroxybenzoate, and phenylethyl alcohol.

Exemplary acidic preservatives include vitamin A, vitamin C, vitamin E, beta-carotene, citric acid, acetic acid, dehydroacetic acid, ascorbic acid, sorbic acid, and phytic acid.

Other preservatives include tocopherol, tocopherol acetate, deteroxime mesylate, cetrimide, butylated hydroxyanisole (BHA), butylated hydroxytoluene (BHT), ethylenediamine, sodium lauryl sulfate (SLS), sodium lauryl ether sulfate (SLES), sodium bisulfite, sodium metabisulfite, potassium sulfite, potassium metabisulfite, Glydant® Plus, Phenonip®, methylparaben, Germall® 115, Germaben® II, Neolone®, Kathon®, and Euxyl®.

Exemplary buffering agents include citrate buffer solutions, acetate buffer solutions, phosphate buffer solutions, ammonium chloride, calcium carbonate, calcium chloride, calcium citrate, calcium glubionate, calcium gluceptate, calcium gluconate, D-gluconic acid, calcium glycerophosphate, calcium lactate, propanoic acid, calcium levulinate, pentanoic acid, dibasic calcium phosphate, phosphoric acid, tribasic calcium phosphate, calcium hydroxide phosphate, potassium acetate, potassium chloride, potassium gluconate, potassium mixtures, dibasic potassium phosphate, monobasic potassium phosphate, potassium phosphate mixtures, sodium acetate, sodium bicarbonate, sodium chloride, sodium citrate, sodium lactate, dibasic sodium phosphate, monobasic sodium phosphate, sodium phosphate mixtures, tromethamine, magnesium hydroxide, aluminum hydroxide, alginic acid, pyrogen-free water, isotonic saline, Ringer's solution, ethyl alcohol, and mixtures thereof.

Exemplary lubricating agents include magnesium stearate, calcium stearate, stearic acid, silica, talc, malt, glyceryl behenate, hydrogenated vegetable oils, polyethylene glycol, sodium benzoate, sodium acetate, sodium chloride, leucine, magnesium lauryl sulfate, sodium lauryl sulfate, and mixtures thereof.

Exemplary natural oils include almond, apricot kernel, avocado, babassu, bergamot, black currant seed, borage, cade, camomile, canola, caraway, carnauba, castor, cinnamon, cocoa butter, coconut, cod liver, coffee, corn, cotton seed, emu, eucalyptus, evening primrose, fish, flaxseed, geraniol, gourd, grape seed, hazel nut, hyssop, isopropyl myristate, jojoba, kukui nut, lavandin, lavender, lemon, litsea cubeba, macadamia nut, mallow, mango seed, meadowfoam seed, mink, nutmeg, olive, orange, orange roughy, palm, palm kernel, peach kernel, peanut, poppy seed, pumpkin seed, rapeseed, rice bran, rosemary, safflower, sandalwood, sasquana, savoury, sea buckthorn, sesame, shea butter, silicone, soybean, sunflower, tea tree, thistle, tsubaki, vetiver, walnut, and wheat germ oils. Exemplary synthetic or semi-synthetic oils include, but are not limited to, butyl stearate, medium chain triglycerides (such as caprylic triglyceride and capric triglyceride), cyclomethicone, diethyl sebacate, dimethicone 360, isopropyl myristate, mineral oil, octyldodecanol, oleyl alcohol, silicone oil, and mixtures thereof. In certain embodiments, exemplary synthetic oils comprise medium chain triglycerides (such as caprylic triglyceride and capric triglyceride).

Liquid dosage forms for oral and parenteral administration include pharmaceutically acceptable emulsions, microemulsions, solutions, suspensions, syrups and elixirs. In addition to the active ingredients, the liquid dosage forms may comprise inert diluents commonly used in the art such as, for example, water or other solvents, solubilizing agents and emulsifiers such as ethyl alcohol, isopropyl alcohol, ethyl carbonate, ethyl acetate, benzyl alcohol, benzyl benzoate, propylene glycol, 1,3-butylene glycol, dimethylformamide, oils (e.g., cottonseed, groundnut, corn, germ, olive, castor, and sesame oils), glycerol, tetrahydrofurfuryl alcohol, polyethylene glycols and fatty acid esters of sorbitan, and mixtures thereof. Besides inert diluents, the oral compositions can include adjuvants such as wetting agents, emulsifying and suspending agents, sweetening, flavoring, and perfuming agents. In certain embodiments for parenteral administration, the conjugates described in this application are mixed with solubilizing agents such as Cremophor®, alcohols, oils, modified oils, glycols, polysorbates, cyclodextrins, polymers, and mixtures thereof.

Injectable preparations, for example, sterile injectable aqueous or oleaginous suspensions, can be formulated according to the known art using suitable dispersing or wetting agents and suspending agents. The sterile injectable preparation can be a sterile injectable solution, suspension, or emulsion in a nontoxic parenterally acceptable diluent or solvent, for example, as a solution in 1,3-butanediol. Among the acceptable vehicles and solvents that can be employed are water, Ringer's solution, U.S.P., and isotonic sodium chloride solution. In addition, sterile, fixed oils are conventionally employed as a solvent or suspending medium. For this purpose, any bland fixed oil can be employed including synthetic mono- or di-glycerides. In addition, fatty acids such as oleic acid are used in the preparation of injectables.

The injectable formulations can be sterilized, for example, by filtration through a bacterial-retaining filter, or by incorporating sterilizing agents in the form of sterile solid compositions which can be dissolved or dispersed in sterile water or other sterile injectable medium prior to use.

In order to prolong the effect of a drug, it is often desirable to slow the absorption of the drug from subcutaneous or intramuscular injection. This can be accomplished by the use of a liquid suspension of crystalline or amorphous material with poor water solubility. The rate of absorption of the drug then depends upon its rate of dissolution, which, in turn, may depend upon crystal size and crystalline form. Alternatively, delayed absorption of a parenterally administered drug form may be accomplished by dissolving or suspending the drug in an oil vehicle.

Compositions for rectal or vaginal administration are typically suppositories which can be prepared by mixing the conjugates described in this application with suitable non-irritating excipients or carriers such as cocoa butter, polyethylene glycol, or a suppository wax which are solid at ambient temperature but liquid at body temperature and therefore melt in the rectum or vaginal cavity and release the active ingredient.

Solid dosage forms for oral administration include capsules, tablets, pills, powders, and granules. In such solid dosage forms, the active ingredient is mixed with at least one inert, pharmaceutically acceptable excipient or carrier such as sodium citrate or dicalcium phosphate and/or (a) fillers or extenders such as starches, lactose, sucrose, glucose, mannitol, and silicic acid, (b) binders such as, for example, carboxymethylcellulose, alginates, gelatin, polyvinylpyrrolidinone, sucrose, and acacia, (c) humectants such as glycerol, (d) disintegrating agents such as agar, calcium carbonate, potato or tapioca starch, alginic acid, certain silicates, and sodium carbonate, (e) solution retarding agents such as paraffin, (f) absorption accelerators such as quaternary ammonium compounds, (g) wetting agents such as, for example, cetyl alcohol and glycerol monostearate, (h) absorbents such as kaolin and bentonite clay, and (i) lubricants such as talc, calcium stearate, magnesium stearate, solid polyethylene glycols, sodium lauryl sulfate, and mixtures thereof. In the case of capsules, tablets, and pills, the dosage form may include a buffering agent.

Solid compositions of a similar type can be employed as fillers in soft and hard-filled gelatin capsules using such excipients as lactose or milk sugar as well as high molecular weight polyethylene glycols and the like. The solid dosage forms of tablets, dragees, capsules, pills, and granules can be prepared with coatings and shells such as enteric coatings and other coatings well known in the art of pharmacology. They may optionally comprise opacifying agents and can be of a composition that they release the active ingredient(s) only, or preferentially, in a certain part of the intestinal tract, optionally, in a delayed manner. Examples of encapsulating compositions which can be used include polymeric substances and waxes.

The active ingredient can be in a micro-encapsulated form with one or more excipients as noted above. The solid dosage forms of tablets, dragees, capsules, pills, and granules can be prepared with coatings and shells such as enteric coatings, release controlling coatings, and other coatings well known in the pharmaceutical formulating art. In such solid dosage forms the active ingredient can be admixed with at least one inert diluent such as sucrose, lactose, or starch. Such dosage forms may comprise, as is normal practice, additional substances other than inert diluents, e.g., tableting lubricants and other tableting aids such as magnesium stearate and microcrystalline cellulose. In the case of capsules, tablets and pills, the dosage forms may comprise buffering agents. They may optionally comprise opacifying agents and can be of a composition that they release the active ingredient(s) only, or preferentially, in a certain part of the intestinal tract, optionally, in a delayed manner. Examples of encapsulating agents which can be used include polymeric substances and waxes.

Dosage forms for topical and/or transdermal administration of a compound described in this application may include ointments, pastes, creams, lotions, gels, powders, solutions, sprays, inhalants, and/or patches. Generally, the active ingredient is admixed under sterile conditions with a pharmaceutically acceptable carrier or excipient and/or any needed preservatives and/or buffers as can be required. Additionally, the present disclosure contemplates the use of transdermal patches, which often have the added advantage of providing controlled delivery of an active ingredient to the body. Such dosage forms can be prepared, for example, by dissolving and/or dispensing the active ingredient in the proper medium. Alternatively or additionally, the rate can be controlled by either providing a rate controlling membrane and/or by dispersing the active ingredient in a polymer matrix and/or gel.

Suitable devices for use in delivering intradermal pharmaceutical compositions described in this application include short needle devices. Intradermal compositions can be administered by devices which limit the effective penetration length of a needle into the skin. Alternatively or additionally, conventional syringes can be used in the classical Mantoux method of intradermal administration. Jet injection devices which deliver liquid formulations to the dermis via a liquid jet injector and/or via a needle which pierces the stratum corneum and produces a jet which reaches the dermis are suitable. Ballistic powder/particle delivery devices which use compressed gas to accelerate the compound in powder form through the outer layers of the skin to the dermis are suitable.

Formulations suitable for topical administration include, but are not limited to, liquid and/or semi-liquid preparations such as liniments, lotions, oil-in-water and/or water-in-oil emulsions such as creams, ointments, and/or pastes, and/or solutions and/or suspensions. Topically administrable formulations may, for example, comprise from about 1% to about 10% (w/w) active ingredient, although the concentration of the active ingredient can be as high as the solubility limit of the active ingredient in the solvent. Formulations for topical administration may further comprise one or more of the additional ingredients described in this application.

A pharmaceutical composition described in this application can be prepared, packaged, and/or sold in a formulation suitable for pulmonary administration via the buccal cavity. Such a formulation may comprise dry particles which comprise the active ingredient and which have a diameter in the range from about 0.5 to about 7 nanometers, or from about 1 to about 6 nanometers. Such compositions are conveniently in the form of dry powders for administration using a device comprising a dry powder reservoir to which a stream of propellant can be directed to disperse the powder and/or using a self-propelling solvent/powder dispensing container such as a device comprising the active ingredient dissolved and/or suspended in a low-boiling propellant in a sealed container. Such powders comprise particles wherein at least 98% of the particles by weight have a diameter greater than 0.5 nanometers and at least 95% of the particles by number have a diameter less than 7 nanometers. Alternatively, at least 95% of the particles by weight have a diameter greater than 1 nanometer and at least 90% of the particles by number have a diameter less than 6 nanometers. Dry powder compositions may include a solid fine powder diluent such as sugar and are conveniently provided in a unit dose form.
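
The two particle-size criteria above combine a by-weight threshold with a by-number threshold, which is easy to misread. The following Python sketch shows one way such a check could be expressed for a measured set of particles. The function name, the input format (parallel lists of diameters and particle weights), and the example data are assumptions made for illustration; diameters are taken in the same unit used in the text.

```python
# Illustrative sketch only: checking a powder against a by-weight lower bound
# and a by-number upper bound on particle diameter. Names and data are hypothetical.
from typing import Sequence


def meets_size_criteria(
    diameters: Sequence[float],
    weights: Sequence[float],
    lower: float = 0.5,
    upper: float = 7.0,
    min_weight_frac: float = 0.98,
    min_number_frac: float = 0.95,
) -> bool:
    """Return True if at least min_weight_frac of the particles by weight have a
    diameter greater than lower AND at least min_number_frac of the particles by
    number have a diameter less than upper."""
    if not diameters or len(diameters) != len(weights):
        raise ValueError("diameters and weights must be non-empty and of equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    weight_above = sum(w for d, w in zip(diameters, weights) if d > lower)
    number_below = sum(1 for d in diameters if d < upper)
    return (
        weight_above / total_weight >= min_weight_frac
        and number_below / len(diameters) >= min_number_frac
    )


if __name__ == "__main__":
    # 99 particles of diameter 1.0 and one of diameter 8.0, all of equal weight.
    sample_d = [1.0] * 99 + [8.0]
    sample_w = [1.0] * 100
    print(meets_size_criteria(sample_d, sample_w))
```

The alternative criteria in the text (95% by weight above 1 and 90% by number below 6) can be checked by passing `lower=1.0, upper=6.0, min_weight_frac=0.95, min_number_frac=0.90`.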

Low boiling propellants generally include liquid propellants having a boiling point of below 65° F. at atmospheric pressure. Generally, the propellant may constitute 50 to 99.9% (w/w) of the composition, and the active ingredient may constitute 0.1 to 20% (w/w) of the composition. The propellant may further comprise additional ingredients such as a liquid non-ionic and/or solid anionic surfactant and/or a solid diluent (which may have a particle size of the same order as particles comprising the active ingredient).
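
For readers working in SI units, the 65° F. threshold corresponds to roughly 18.3° C. The following Python sketch performs that conversion and checks a candidate formulation against the propellant and active ingredient ranges stated above. The function names and the example values are hypothetical and serve only to illustrate the arithmetic.

```python
# Illustrative sketch only: unit conversion and range checks for the
# propellant paragraph. Function names and example values are hypothetical.

def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def is_low_boiling(boiling_point_f: float, limit_f: float = 65.0) -> bool:
    """A propellant is treated as low boiling if it boils below the limit."""
    return boiling_point_f < limit_f


def fractions_in_range(propellant_pct: float, active_pct: float) -> bool:
    """Check the stated ranges: propellant 50 to 99.9% (w/w) and active
    ingredient 0.1 to 20% (w/w), with the two together not exceeding 100%."""
    return (
        50.0 <= propellant_pct <= 99.9
        and 0.1 <= active_pct <= 20.0
        and propellant_pct + active_pct <= 100.0
    )


if __name__ == "__main__":
    print(round(fahrenheit_to_celsius(65.0), 1))  # threshold expressed in Celsius
    print(fractions_in_range(90.0, 5.0))
```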

Although the descriptions of pharmaceutical compositions provided in this application are principally directed to pharmaceutical compositions which are suitable for administration to humans, it will be understood by the skilled artisan that such compositions are generally suitable for administration to animals of all sorts. Modification of pharmaceutical compositions suitable for administration to humans in order to render the compositions suitable for administration to various animals is well understood, and the ordinarily skilled veterinary pharmacologist can design and/or perform such modification with ordinary experimentation.

Compounds provided in this application are typically formulated in dosage unit form for ease of administration and uniformity of dosage. It will be understood, however, that the total daily usage of the compositions described in this application will be decided by a physician within the scope of sound medical judgment. The specific therapeutically effective dose level for any particular subject or organism will depend upon a variety of factors including the disease being treated and the severity of the disorder; the activity of the specific active ingredient employed; the specific composition employed; the age, body weight, general health, sex, and diet of the subject; the time of administration, route of administration, and rate of excretion of the specific active ingredient employed; the duration of the treatment; drugs used in combination or coincidental with the specific active ingredient employed; and like factors well known in the medical arts.

The compounds and compositions provided in this application can be administered by any route, including enteral (e.g., oral), parenteral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, subcutaneous, intraventricular, transdermal, interdermal, rectal, intravaginal, intraperitoneal, topical (as by powders, ointments, creams, and/or drops), mucosal, nasal, buccal, sublingual; by intratracheal instillation, bronchial instillation, and/or inhalation; and/or as an oral spray, nasal spray, and/or aerosol. Specifically contemplated routes are oral administration, intravenous administration (e.g., systemic intravenous injection), regional administration via blood and/or lymph supply, and/or direct administration to an affected site. In general, the most appropriate route of administration will depend upon a variety of factors including the nature of the agent (e.g., its stability in the environment of the gastrointestinal tract), and/or the condition of the subject (e.g., whether the subject is able to tolerate oral administration).

In some embodiments, compounds or compositions disclosed in this application are formulated and/or administered in nanoparticles. Nanoparticles are particles in the nanoscale. In some embodiments, nanoparticles are less than 1 μm in diameter. In some embodiments, nanoparticles are between about 1 and 100 nm in diameter. Nanoparticles include organic nanoparticles, such as dendrimers, liposomes, or polymeric nanoparticles. Nanoparticles also include inorganic nanoparticles, such as fullerenes, quantum dots, and gold nanoparticles. Compositions may comprise an aggregate of nanoparticles. In some embodiments, the aggregate of nanoparticles is homogeneous, while in other embodiments the aggregate of nanoparticles is heterogeneous.

The exact amount of a compound required to achieve an effective amount will vary from subject to subject, depending, for example, on species, age, and general condition of a subject, severity of the side effects or disorder, identity of the particular compound, mode of administration, and the like. An effective amount may be included in a single dose (e.g., single oral dose) or multiple doses (e.g., multiple oral doses). In certain embodiments, when multiple doses are administered to a subject or applied to a tissue or cell, any two doses of the multiple doses include different or substantially the same amounts of a compound described in this application. In certain embodiments, when multiple doses are administered to a subject or applied to a tissue or cell, the frequency of administering the multiple doses to the subject or applying the multiple doses to the tissue or cell is three doses a day, two doses a day, one dose a day, one dose every other day, one dose every third day, one dose every week, one dose every two weeks, one dose every three weeks, or one dose every four weeks. In certain embodiments, the frequency of administering the multiple doses to the subject or applying the multiple doses to the tissue or cell is one dose per day. In certain embodiments, the frequency of administering the multiple doses to the subject or applying the multiple doses to the tissue or cell is two doses per day. In certain embodiments, the frequency of administering the multiple doses to the subject or applying the multiple doses to the tissue or cell is three doses per day.
In certain embodiments, when multiple doses are administered to a subject or applied to a tissue or cell, the duration between the first dose and last dose of the multiple doses is one day, two days, four days, one week, two weeks, three weeks, one month, two months, three months, four months, six months, nine months, one year, two years, three years, four years, five years, seven years, ten years, fifteen years, twenty years, or the lifetime of the subject, tissue, or cell. In certain embodiments, the duration between the first dose and last dose of the multiple doses is three months, six months, or one year. In certain embodiments, the duration between the first dose and last dose of the multiple doses is the lifetime of the subject, tissue, or cell. In certain embodiments, a dose (e.g., a single dose, or any dose of multiple doses) described in this application includes independently between 0.1 μg and 1 μg, between 0.001 mg and 0.01 mg, between 0.01 mg and 0.1 mg, between 0.1 mg and 1 mg, between 1 mg and 3 mg, between 3 mg and 10 mg, between 10 mg and 30 mg, between 30 mg and 100 mg, between 100 mg and 300 mg, between 300 mg and 1,000 mg, or between 1 g and 10 g, inclusive, of a compound described in this application. In certain embodiments, a dose described in this application includes independently between 1 mg and 3 mg, inclusive, of a compound described in this application. In certain embodiments, a dose described in this application includes independently between 3 mg and 10 mg, inclusive, of a compound described in this application. In certain embodiments, a dose described in this application includes independently between 10 mg and 30 mg, inclusive, of a compound described in this application. In certain embodiments, a dose described in this application includes independently between 30 mg and 100 mg, inclusive, of a compound described in this application.

Dose ranges as described in this application provide guidance for the administration of provided pharmaceutical compositions to an adult. The amount to be administered to, for example, a child or an adolescent can be determined by a medical practitioner or person skilled in the art and can be lower than or the same as that administered to an adult.

A compound or composition, as described in this application, can be administered in combination with one or more additional pharmaceutical agents (e.g., therapeutically and/or prophylactically active agents). The compounds or compositions can be administered in combination with additional pharmaceutical agents that improve their activity, improve bioavailability, improve safety, reduce drug resistance, reduce and/or modify metabolism, inhibit excretion, and/or modify distribution in a subject or cell. It will also be appreciated that the therapy employed may achieve a desired effect for the same disorder, and/or it may achieve different effects. In certain embodiments, a pharmaceutical composition described in this application including a compound described in this application and an additional pharmaceutical agent shows a synergistic effect that is absent in a pharmaceutical composition including one of the compound and the additional pharmaceutical agent, but not both.

The compound or composition can be administered concurrently with, prior to, or subsequent to one or more additional pharmaceutical agents, which may be useful as, e.g., combination therapies. Pharmaceutical agents include therapeutically active agents. Pharmaceutical agents also include prophylactically active agents. Pharmaceutical agents include small organic molecules such as drug compounds (e.g., compounds approved for human or veterinary use by the U.S. Food and Drug Administration as provided in the Code of Federal Regulations (CFR)), peptides, proteins, carbohydrates, monosaccharides, oligosaccharides, polysaccharides, nucleoproteins, mucoproteins, lipoproteins, synthetic polypeptides or proteins, small molecules linked to proteins, glycoproteins, steroids, nucleic acids, DNAs, RNAs, nucleotides, nucleosides, oligonucleotides, antisense oligonucleotides, lipids, hormones, vitamins, and cells. In certain embodiments, the additional pharmaceutical agent is a pharmaceutical agent useful for treating and/or preventing a disease (e.g., proliferative disease, neurological disease, painful condition, psychiatric disorder, or metabolic disorder). Each additional pharmaceutical agent may be administered at a dose and/or on a time schedule determined for that pharmaceutical agent. The additional pharmaceutical agents may also be administered together with each other and/or with the compound or composition described in this application in a single dose or administered separately in different doses. The particular combination to employ in a regimen will take into account compatibility of the compound described in this application with the additional pharmaceutical agent(s) and/or the desired therapeutic and/or prophylactic effect to be achieved.
In general, it is expected that the additional pharmaceutical agent(s) in combination be utilized at levels that do not exceed the levels at which they are utilized individually. In some embodiments, the levels utilized in combination will be lower than those utilized individually.

In some embodiments, one or more of the compositions described in this application are administered to a subject. In certain embodiments, the subject is an animal. The animal may be of either sex and may be at any stage of development. In certain embodiments, the subject is a human. In other embodiments, the subject is a non-human animal. In certain embodiments, the subject is a mammal. In certain embodiments, the subject is a non-human mammal. In certain embodiments, the subject is a domesticated animal, such as a dog, cat, cow, pig, horse, sheep, or goat. In certain embodiments, the subject is a companion animal, such as a dog or cat. In certain embodiments, the subject is a livestock animal, such as a cow, pig, horse, sheep, or goat. In certain embodiments, the subject is a zoo animal. In another embodiment, the subject is a research animal, such as a rodent (e.g., mouse, rat), dog, pig, or non-human primate.

Also encompassed by the disclosure are kits (e.g., pharmaceutical packs). The kits provided may comprise a composition, such as a pharmaceutical composition, or a compound described in this application and a container (e.g., a vial, ampule, bottle, syringe, and/or dispenser package, or other suitable container). In some embodiments, provided kits may optionally further include a second container comprising a pharmaceutical excipient for dilution or suspension of a pharmaceutical composition or compound described in this application. In some embodiments, the pharmaceutical composition or compound described in this application provided in the first container and the second container are combined to form one unit dosage form.

Thus, in one aspect, provided are kits including a first container comprising a compound or composition described in this application. In certain embodiments, the kits are useful for treating a disease in a subject in need thereof. In certain embodiments, the kits are useful for preventing a disease in a subject in need thereof. In certain embodiments, the kits are useful for reducing the risk of developing a disease in a subject in need thereof.

In certain embodiments, a kit described in this application further includes instructions for using the kit. A kit described in this application may also include information as required by a regulatory agency such as the U.S. Food and Drug Administration (FDA). In certain embodiments, the information included in the kits is prescribing information. In certain embodiments, the kits and instructions provide for treating a disease in a subject in need thereof. In certain embodiments, the kits and instructions provide for preventing a disease in a subject in need thereof. In certain embodiments, the kits and instructions provide for reducing the risk of developing a disease in a subject in need thereof. A kit described in this application may include one or more additional pharmaceutical agents described in this application as a separate composition.

The present invention is further illustrated by the following Examples, which in no way should be construed as limiting. The entire contents of all of the references (including literature references, issued patents, published patent applications, and co-pending patent applications) cited throughout this application are hereby expressly incorporated by reference. If a reference incorporated in this application contains a term whose definition is incongruous or incompatible with the definition of same term as defined in the present disclosure, the meaning ascribed to the term in this disclosure shall govern. However, mention of any reference, article, publication, patent, patent publication, and patent application cited in this application is not, and should not be taken as an acknowledgment or any form of suggestion that they constitute valid prior art or form part of the common general knowledge in any country in the world.

EXAMPLES

Example 1: Functional Expression of AAE Genes in E. coli and S. cerevisiae

It was reported previously that S. cerevisiae has endogenous AAE activity that allows conversion of hexanoate to hexanoyl-CoA (Gagne et al. 2012). However, in some embodiments, the endogenous AAE activity of S. cerevisiae may be insufficient for industrial-scale synthesis of downstream products. This example validates novel genes with AAE activity that can be used in the cells, reactions, and methods of the present disclosure.

Several DNA sequences with predicted AAE functionality were identified from the genomes of the yeast Yarrowia lipolytica (Y. lipolytica) and the bacterium Rhodopseudomonas palustris (R. palustris). The predicted AAE genes were first codon-optimized in silico for expression in E. coli. The codon-optimized gene sequences were synthesized via standard DNA synthesis techniques and were expressed in recombinant E. coli host cells (FIG. 4). Lysates from the recombinant E. coli host cells were then tested for AAE activity using an assay described below.

FIG. 4 shows the results from the AAE activity assay in E. coli host cells. Three of the four predicted Y. lipolytica AAEs (strains t49578, t49594, and t51477) and both of the predicted R. palustris AAEs (strains t55127 and t55128) exhibited activity on a hexanoate substrate. Strains t49594 and t51477 expressed the candidate AAE enzyme as a fusion protein with an N-terminal MYC tag. In addition, the assays showed that two of the four predicted Y. lipolytica AAEs (strains t49594 and t51477) and both of the predicted R. palustris AAEs (strains t55127 and t55128) demonstrated activity on a butyrate substrate.

The newly described AAEs were also found to be capable of exhibiting AAE activity in eukaryotes. Briefly, the Y. lipolytica AAE that produced the best results in E. coli host cells was selected. This corresponded to the AAE expressed by strain t49594 (which encodes a protein corresponding to the protein provided by Uniprot Accession No. Q6C577 with an N-terminal MYC tag). The gene encoding this AAE was codon-optimized for expression in S. cerevisiae, and the last three residues (peroxisomal targeting signal 1) were removed. Two different codon-optimized versions of this AAE were synthesized in the replicative yeast expression vector shown in FIGS. 5A-5B. The recoded sequences shared only 81.66% sequence identity at the DNA level, while encoding the same polypeptide. Both AAE expression constructs were then transformed into a CEN.PK S. cerevisiae strain, and transformants were selected based on ability to grow on media lacking uracil. The transformants were tested for AAE activity with a colorimetric AAE assay (described below). An S. cerevisiae strain expressing GFP was used as a negative control (strain t390338).
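The point that two recoded genes can differ at the DNA level while encoding the same polypeptide can be illustrated with a minimal sketch. The two sequences below are hypothetical toy sequences invented for illustration (they are not the sequences of this Example), and the codon table covers only the codons they use.

```python
# Toy illustration: synonymous recoding changes DNA identity but not the protein.
# VERSION_A and VERSION_B are hypothetical sequences, not sequences from this Example.

# Partial standard-genetic-code table, limited to the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "TCT": "S", "AGC": "S",
    "TTG": "L", "CTG": "L",
    "AAA": "K", "AAG": "K",
}

def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length (no gaps handled)")
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)

VERSION_A = "ATGGCTTCTTTGAAA"
VERSION_B = "ATGGCCAGCCTGAAG"

if __name__ == "__main__":
    print(translate(VERSION_A), translate(VERSION_B))
    print(percent_identity(VERSION_A, VERSION_B))
```

Real genes would be compared with a proper alignment tool; the ungapped comparison here only works because the toy sequences have equal length.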

The results from the colorimetric AAE assay are shown in FIG. 6. Both of the codon-optimized versions of the Y. lipolytica AAE (strains t392878 and t392879) exhibited AAE activity on a hexanoate substrate, demonstrating that the newly disclosed AAEs could also be used in eukaryotic hosts. These enzymes were thus demonstrated to be capable of catalyzing the first enzymatic step in microbial production of cannabinoids from carboxylic acids.

This Example demonstrates identification of AAEs that are capable of using hexanoate and butyrate as substrates to produce cannabinoid precursors. Detailed results for the AAE activity experiments in E. coli host cells are provided below in Table 3. Sequence information for strains described in this Example is provided in Table 4 at the end of the Examples section.

TABLE 3
Activity of AAE Enzymes on Hexanoate and Butyrate in E. coli

Strain                      Average       Standard Deviation   Average       Standard Deviation
(E. coli)                   Activity on   Activity on          Activity on   Activity on
                            Hexanoate     Hexanoate            Butyrate      Butyrate
t49568 (Negative control)   −0.0495       0.014849             −0.0105       0.04879
t49578                      0.222         0.016971             −0.0195       0.010607
t49580                      −0.065        0.019799             −0.055        0.007071
t49594                      0.458         0.005657             0.3895        0.000707
t51477                      0.347         0.005657             0.046         0.005657
t55127                      0.395         0.011314             0.1835        0.024749
t55128                      0.2495        0.012021             0.2205        0.012021
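As a rough aid to reading Table 3, the sketch below flags a strain as active on a substrate when its average activity minus two standard deviations is still above zero. This cutoff is an assumption made here for illustration; the disclosure does not state the criterion used to call a strain active. With this cutoff the flagged strains happen to match those named in the text of this Example.

```python
# Values transcribed from Table 3: strain -> ((mean, sd) on hexanoate, (mean, sd) on butyrate).
TABLE_3 = {
    "t49568": ((-0.0495, 0.014849), (-0.0105, 0.04879)),  # negative control
    "t49578": ((0.222, 0.016971), (-0.0195, 0.010607)),
    "t49580": ((-0.065, 0.019799), (-0.055, 0.007071)),
    "t49594": ((0.458, 0.005657), (0.3895, 0.000707)),
    "t51477": ((0.347, 0.005657), (0.046, 0.005657)),
    "t55127": ((0.395, 0.011314), (0.1835, 0.024749)),
    "t55128": ((0.2495, 0.012021), (0.2205, 0.012021)),
}

HEXANOATE, BUTYRATE = 0, 1

def is_active(mean: float, sd: float, k: float = 2.0) -> bool:
    """Illustrative cutoff (an assumption): mean must exceed zero by k standard deviations."""
    return mean - k * sd > 0

def active_strains(substrate: int) -> list[str]:
    """Return the sorted strain IDs flagged as active on the given substrate."""
    return sorted(
        strain for strain, values in TABLE_3.items() if is_active(*values[substrate])
    )

if __name__ == "__main__":
    print("hexanoate:", active_strains(HEXANOATE))
    print("butyrate: ", active_strains(BUTYRATE))
```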

Materials and Methods

AAE Assay for E. coli

E. coli BL21 strains harboring a plasmid that contained AAE genes driven by a T7 promoter were inoculated from glycerol stocks into shake flasks with 25 mL LB and grown overnight at 37° C. with shaking at 250 RPM. The next day, strains were inoculated 1% (v/v) into LB and grown for 3-6 hours until an OD600 of ˜0.6 was attained. They were then induced with 1 mM IPTG and incubated overnight at 23° C. with shaking at 250 RPM. The next day, the cultures were harvested and pelleted. Cell pellets were lysed with BugBuster™ reagent (5 mL per g wet pellet) at 18° C. and shaken at 250 RPM for 20 min. Lysates were centrifuged at 4° C. and 4000 RPM for 20 min. The soluble fractions of the lysates were taken for the enzyme assay. The enzyme assay mixture contained 5 mM substrate (sodium hexanoate or sodium butyrate), 3 mM ATP, 1 mM CoA, 5 mM MgCl₂, and 100 mM HEPES (pH 7.5). 25 μL of E. coli lysates were added to 500 μL of assay mixture and allowed to react at 30° C., 250 RPM for 20 min. Assays were then quenched by adding 50 μL of the reaction to 50 μL of 2 mM DTNB. Absorbance was measured at 412 nm to quantify the decrease in free CoA.
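The conversion from absorbance at 412 nm to free CoA is not spelled out in the protocol. A minimal sketch using the Beer-Lambert law is given below under stated assumptions: a molar extinction coefficient of 14,150 M⁻¹ cm⁻¹ for the TNB product (a commonly cited literature value, not taken from this disclosure), a 1 cm path length, a blank-corrected reading, and the 2-fold dilution from the 50 μL + 50 μL quench. The function names are hypothetical.

```python
# Assumed literature value for the TNB anion at 412 nm (M^-1 cm^-1); not from the source text.
TNB_EXTINCTION_M_CM = 14150.0

def free_thiol_mM(a412: float, path_cm: float = 1.0, dilution: float = 2.0) -> float:
    """Free thiol (mM) in the reaction before quenching, from a blank-corrected A412.

    dilution=2.0 reflects 50 uL of reaction added to 50 uL of DTNB.
    """
    molar_in_well = a412 / (TNB_EXTINCTION_M_CM * path_cm)
    return molar_in_well * dilution * 1000.0

def coa_consumed_mM(a412_control: float, a412_sample: float, **kwargs) -> float:
    """Decrease in free CoA relative to a control reaction (e.g., the negative-control lysate)."""
    return free_thiol_mM(a412_control, **kwargs) - free_thiol_mM(a412_sample, **kwargs)

def initial_coa_mM(stock_mM: float = 1.0, mix_uL: float = 500.0, lysate_uL: float = 25.0) -> float:
    """CoA concentration at time zero after adding lysate to the assay mixture."""
    return stock_mM * mix_uL / (mix_uL + lysate_uL)

if __name__ == "__main__":
    print(round(initial_coa_mM(), 3))            # CoA available at the start of the reaction
    print(round(coa_consumed_mM(1.2, 0.9), 4))   # hypothetical readings, for illustration only
```

Plate readers rarely have a 1 cm path length, so `path_cm` would need to be set for the actual plate and fill volume, or a CoA standard curve used instead.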

AAE Assay for S. cerevisiae

5 μL/well of thawed glycerol stocks were stamped into 300 μL/well of SC-URA+4% dextrose in half-height deepwell plates, which were sealed with AeraSeal™ film. Samples were incubated at 30° C. and shaken at 1000 RPM in 80% humidity for 2 days. 10 μL/well of resulting precultures were stamped into 300 μL/well of SC-URA+4% dextrose in half-height deepwell plates, which were sealed with AeraSeal™ film. Samples were incubated at 30° C. and shaken at 1000 RPM in 80% humidity for 3 days. 10 μL of resulting production cultures were stamped into 140 μL/well PBS in flat bottom plates. Optical measurements were taken on a plate reader, with absorbance measured at 600 nm and fluorescence at 528 nm with 485 nm excitation.

Production culture plates were centrifuged at 4000 RPM for 10 min. Supernatant was removed, and the plates of pellets were heat-sealed and frozen at −80° C.

Pellets were thawed and 200 μL Y-PER per well was added. Samples were agitated at room temperature for 20 minutes and then pelleted at 3500 RPM for 10 minutes. 50 μL of the clarified lysate was combined with 50 μL of feed buffer or CoA standard in clear bottom plates. Plates were then incubated at 30° C. and shaken at 1000 RPM in 80% humidity for 60 min. 1 μL of DTNB buffer was added to each well to a final concentration of 100 μM DTNB, and samples were agitated at room temperature for 15 minutes. Absorbance was measured at 412 nm to quantify the decrease in free CoA.

Materials included:

-   Feed Buffers:
    -   10 mM MgCl₂
    -   1 mM sodium hexanoate
    -   0.5 mM CoA
    -   1 mM ATP
    -   100 mM Tris-HCl pH 7.6
-   DTNB Buffer:
    -   10 mM DTNB (Sigma D8130) (stock of DTNB in DMSO)
    -   100 mM Tris-HCl pH 7.6 (Teknova)
-   Y-PER Yeast Protein Extraction Reagent (Thermo 78990):
    -   +1 tablet/50 mL complete, EDTA-free, protease inhibitors (Sigma, 11873580001)
-   Coenzyme A trilithium salt (Sigma C3019)
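The working concentrations in the S. cerevisiae assay follow from simple dilution arithmetic (C1·V1 = C2·V2). The sketch below, with a hypothetical helper named `dilute`, checks that 1 μL of the 10 mM DTNB stock in a 100 μL well gives roughly the stated 100 μM, and computes the feed buffer concentrations after 1:1 mixing with lysate. It assumes the listed feed buffer concentrations are those of the buffer before mixing, which the text does not state explicitly.

```python
def dilute(stock_conc: float, added_uL: float, existing_uL: float) -> float:
    """Concentration after adding `added_uL` of stock to `existing_uL` of liquid."""
    return stock_conc * added_uL / (added_uL + existing_uL)

# Feed buffer components (mM), as listed in the materials.
FEED_BUFFER_mM = {
    "MgCl2": 10.0,
    "sodium hexanoate": 1.0,
    "CoA": 0.5,
    "ATP": 1.0,
    "Tris-HCl": 100.0,
}

def reaction_concentrations_mM(lysate_uL: float = 50.0, feed_uL: float = 50.0) -> dict[str, float]:
    """Component concentrations once feed buffer is combined with clarified lysate."""
    return {name: dilute(conc, feed_uL, lysate_uL) for name, conc in FEED_BUFFER_mM.items()}

def dtnb_final_uM(stock_mM: float = 10.0, added_uL: float = 1.0, well_uL: float = 100.0) -> float:
    """Final DTNB concentration (uM) after spiking the stock into the well."""
    return dilute(stock_mM, added_uL, well_uL) * 1000.0

if __name__ == "__main__":
    print(reaction_concentrations_mM())
    print(round(dtnb_final_uM(), 1))
```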

Example 2: Functional Expression of OLS Genes in S. cerevisiae

Functional expression of C. sativa olivetol synthase (OLS) and olivetolic acid cyclase (OAC) enzymes in S. cerevisiae was previously reported (Gagne et al. 2012). To identify other OLS genes that can be functionally expressed, a library of approximately 2000 OLS candidate genes was designed. The genes within the library were codon-optimized for expression in S. cerevisiae and synthesized in the replicative yeast expression vector shown in FIGS. 5A-5B. Each candidate OLS was transformed into an auxotrophic S. cerevisiae CEN.PK GAL80 knockout strain, and transformants were selected based on ability to grow on media lacking uracil. The transformants were tested for olivetol and olivetolic acid production from sodium hexanoate in vivo in a high-throughput primary screen, as described in the materials and methods section below. Top olivetol and/or olivetolic acid-producing strains that were identified in the primary screen were subsequently tested in a secondary screen to verify and further quantify olivetol and olivetolic acid production.

Numerous yeast transformants were observed to be capable of producing olivetol in the primary screen (FIG. 8). In particular, two of the top olivetol-producing strains were strain t395094 and strain t393991 (FIG. 8). These two strains were also found to be among the top olivetol-producing strains in the secondary screen.

When the OLS library described in this Example was designed and screened in the primary and secondary screens, it was expected that the strains expressed full-length candidate OLS enzymes. Specifically, strain t395094 was believed to express a full-length OLS protein from Araucaria cunninghamii (Hoop pine) (corresponding to Uniprot Accession No. A0A0D6QTX3) and strain t393991 was believed to express a full-length OLS protein from Cymbidium hybrid cultivar (corresponding to Uniprot Accession No. A0A088G5Z5). However, as explained further in Table 5 at the end of the Examples section, sequencing analysis of strains from the OLS library used for these screens later revealed that there was a 6-nucleotide deletion in the sequences of many of the genes encoding the OLS enzymes in the library. Specifically, this deletion affected all of the candidate OLSs expressed by the strains identified in FIGS. 8-10, including the candidate OLSs expressed by strains t395094 and t393991 (Table 5).

The 6-nucleotide deletion included the first two nucleotides within the start codon of the genes encoding the OLS enzymes. As one of ordinary skill in the art would appreciate, such a deletion may result in the truncation of one or more amino acids from the N-terminus of the proteins encoded by the affected genes, and such a deletion could potentially extend to the next in-frame methionine residue in the intended protein sequence. For example, strain t393991 expressed a truncated version of a codon-optimized nucleic acid encoding an OLS protein from Cymbidium hybrid cultivar (Table 5). The full-length Cymbidium hybrid cultivar protein corresponds to SEQ ID NO: 7. If the deletion in the nucleic acid encoding this OLS protein were to result in translation commencing from the next start codon within the same reading frame, this would result in an N-terminally truncated version of the full-length OLS protein from Cymbidium hybrid cultivar. A protein sequence for a truncated protein that commences from the next start codon within the same reading frame is provided by SEQ ID NO: 714. SEQ ID NO: 714 has a truncation of the first 86 amino acids of SEQ ID NO: 7 and is approximately 77.9% identical to SEQ ID NO: 7.
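The relationship between an N-terminal truncation and the resulting percent identity can be sketched as follows. The toy protein is hypothetical, and the full length of 389 residues is an assumption chosen only because it is consistent with the two figures stated above (86 residues removed, approximately 77.9% identity); the actual length of SEQ ID NO: 7 should be taken from the Sequence Listing. Identity is computed here over the full-length sequence, which is one of several conventions.

```python
def truncate_to_next_met(protein: str) -> str:
    """Return the protein as it would read if translation began at the next methionine.

    The search starts after the original initiator methionine at position 0.
    """
    next_met = protein.find("M", 1)
    if next_met == -1:
        raise ValueError("no internal methionine available as an alternative start")
    return protein[next_met:]

def percent_identity_after_truncation(full_length: int, residues_removed: int) -> float:
    """Identity of a pure N-terminal truncation, measured over the full-length sequence."""
    return 100.0 * (full_length - residues_removed) / full_length

TOY_PROTEIN = "MSTLKMAAGQ"      # hypothetical sequence for illustration
ASSUMED_FULL_LENGTH = 389       # assumption, not stated in this section

if __name__ == "__main__":
    print(truncate_to_next_met(TOY_PROTEIN))
    print(round(percent_identity_after_truncation(ASSUMED_FULL_LENGTH, 86), 1))
```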

Due to the truncation of OLS candidate genes within the library, candidate OLS genes screened in this Example were independently screened again using a new library that expressed only full-length OLS genes (Example 3). As discussed in Example 3, screening with a full-length OLS library independently identified both the OLS protein from Araucaria cunninghamii (Hoop pine) (corresponding to Uniprot Accession No. A0A0D6QTX3) and the OLS protein from Cymbidium hybrid cultivar (corresponding to Uniprot Accession No. A0A088G5Z5), discussed above, verifying the identification of these candidate OLSs as being highly effective for olivetol production in recombinant host cells.

It was determined that the OLS enzymes expressed by positive control strains t339579 and t339582, depicted in FIGS. 8-10, were also affected by the 6-nucleotide deletion discussed above. Accordingly, the low amounts of olivetol and olivetolic acid produced by the strains labelled as positive controls in FIGS. 8-10 may have been caused by disrupted expression of these proteins due to truncation.

Identification of Bifunctional PKS-PKC Enzymes

It was previously observed that S. cerevisiae possesses native OAC activity which enables some amount of the OLS product, 3,5,7-trioxododecanoyl-CoA, to be converted to olivetolic acid instead of undergoing a spontaneous decarboxylative cyclization to olivetol in the absence of OAC activity (FIG. 1). Most strains tested in the primary screen were observed to produce a constant (i.e., fixed) ratio of olivetolic acid to olivetol (FIG. 10). Without wishing to be bound by any theory, the accumulation of olivetolic acid and olivetol in these strains may be due to the reported endogenous S. cerevisiae OAC activity competing with spontaneous conversion of 3,5,7-trioxododecanoyl-CoA to olivetol. Both products, olivetol and olivetolic acid, increase proportionally with their shared precursor.

However, multiple strains were identified in the primary screen that demonstrated olivetolic acid production outside of the constant olivetolic acid to olivetol ratio discussed above. Strain t393974 demonstrated the highest olivetolic acid production (FIG. 9). In particular, strain t393974 was observed to produce substantially more olivetolic acid than olivetol in the primary screen, a quantity of olivetolic acid that was outside of the fixed ratio exhibited by other tested strains (FIG. 10). These data suggested that the OLS enzyme expressed by strain t393974 may be a bifunctional enzyme possessing both polyketide synthase and polyketide cyclase catalytic functions and may be capable of catalyzing both reactions R2 and R3 in FIG. 2, and, at least, both reactions R2a and R3a in FIG. 1 (“Bifunctional PKS-PKC”).
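One way to picture the reasoning above is to compute each strain's olivetolic acid to olivetol ratio and flag strains that sit well above the typical ratio. The sketch below is illustrative only: the strain names and titers are hypothetical, and the cutoff of three times the median ratio is an assumption, not the criterion used in the screen.

```python
from statistics import median

# Hypothetical titers for illustration: strain -> (olivetol, olivetolic acid), arbitrary units.
HYPOTHETICAL_TITERS = {
    "strain_A": (10.0, 2.0),
    "strain_B": (20.0, 4.0),
    "strain_C": (5.0, 1.0),
    "strain_D": (4.0, 12.0),   # produces more olivetolic acid than olivetol
}

def acid_to_olivetol_ratios(titers: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Olivetolic acid divided by olivetol for each strain."""
    return {strain: acid / olivetol for strain, (olivetol, acid) in titers.items()}

def flag_high_ratio_strains(titers: dict[str, tuple[float, float]], fold: float = 3.0) -> list[str]:
    """Strains whose ratio exceeds `fold` times the median ratio (illustrative cutoff)."""
    ratios = acid_to_olivetol_ratios(titers)
    cutoff = fold * median(ratios.values())
    return sorted(strain for strain, ratio in ratios.items() if ratio > cutoff)

if __name__ == "__main__":
    print(acid_to_olivetol_ratios(HYPOTHETICAL_TITERS))
    print(flag_high_ratio_strains(HYPOTHETICAL_TITERS))
```

A strain flagged this way would be a candidate for a bifunctional enzyme only in the hedged sense described in the text; the ratio alone does not establish cyclase activity.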

As discussed above, when the OLS library described in this Example was designed and screened in the primary and secondary screens, it was expected that the strains expressed full-length candidate OLS enzymes. Specifically, strain t393974 was believed to express a full-length OLS protein from Corchorus olitorius (Jute) (corresponding to UniProt Accession No. A0A1R3HSU5). However, as discussed above and explained further in Table 5 at the end of the Examples section, sequencing analysis of strains from the OLS library used for these screens later revealed that there was a 6-nucleotide deletion in the sequences of many of the genes encoding the OLS enzymes in the library, which affected all of the candidate OLSs expressed by the strains identified in FIGS. 8-10, including the candidate OLS expressed by strain t393974 (Table 5). Accordingly, strain t393974 expressed a truncated version of a codon-optimized nucleic acid encoding an OLS protein from Corchorus olitorius (Jute) (Table 5). The full-length Corchorus olitorius (Jute) protein corresponds to SEQ ID NO: 6.

Due to the sequence truncation of OLS candidate genes within the library, candidate OLS genes screened in this Example were independently screened again using a new library that expressed only full-length OLS genes (Example 3). As discussed in Example 3, screening with a full-length OLS library also independently identified the candidate OLS protein from Corchorus olitorius (Jute) (corresponding to UniProt Accession No. A0A1R3HSU5; SEQ ID NO: 6) as an OLS that produced both olivetol and olivetolic acid.

Materials and Methods

OLS Assay

A library of approximately 2000 OLS enzymes was transformed into S. cerevisiae. 5 μL/well of thawed glycerol stocks were stamped into 300 μL/well of SC-URA+4% dextrose in half-height deepwell plates, which were sealed with AeraSeal™ films. Samples were incubated at 30° C. and shaken at 1000 RPM in 80% humidity for 2 days. 10 μL/well of the resulting precultures were stamped into 300 μL/well of SC-URA+4% dextrose+1 mM sodium hexanoate in half-height deepwell plates, which were sealed with AeraSeal™ films. Samples were incubated at 30° C. and shaken at 1000 RPM in 80% humidity for 4 days. 10 μL/well of the resulting production cultures were stamped into 140 μL/well PBS in flat bottom plates. Optical measurements were taken on a plate reader, with absorbance measured at 600 nm and fluorescence at 528 nm with 485 nm excitation.

30 μL/well of production cultures were stamped into 270 μL/well of 100% methanol containing 300 μg/L 3-(3-Hydroxypropyl)phenol (3HPP) in half-height deepwell plates. Plates were heat sealed and frozen at −80° C. for two hours. Plates were then thawed for 30 minutes and spun down at 4° C. at 4000 RPM for 10 min. 75 μL of supernatant from each well of each plate was stamped into Corning 3694 (half area) plates, which were then submitted for LC-MS quantification of olivetol and olivetolic acid.

The experimental protocol for the secondary screen was the same as described above, except that four replicates per strain were tested and standard curves of both olivetol and olivetolic acid were prepared so that both products could be quantified.

Example 3: Generation and Screening of Full-Length OLS Library

As discussed in Example 2, sequencing analysis of strains from the OLS library used for the screening described in Example 2 revealed that many of the genes encoding the OLS enzymes in the library were inadvertently truncated N-terminally. This truncation affected all of the strains identified in FIGS. 8-10, including the positive control strains.

Accordingly, a new OLS library was generated to contain only OLS genes that produce full-length OLS enzymes. The full-length OLS library contained full-length versions of the approximately 2000 OLS enzymes from the original library described in Example 2 and also included approximately 900 additional candidate OLS enzymes. All candidate OLS enzymes were codon optimized for expression in S. cerevisiae. Strain t527340, comprising an OLS from C. sativa, was included in the library as a positive control, and strain t527338, comprising GFP, was included in the library as a negative control. A high-throughput primary screen was conducted with the full-length OLS library using the same OLS assay described in Example 2 with the following exceptions: all library members and controls were transformed into an auxotrophic CEN.PK strain that comprised a chromosomally integrated heterologous gene encoding AAE VcsA (Uniprot Accession Q6N4N8) from R. palustris; and neither sodium hexanoate nor sodium butyrate was included in the production cultures.

Top olivetol and/or olivetolic acid-producing strains from the high-throughput primary screen were carried over to a secondary screen to verify and further quantify olivetol production. The experimental protocol for the secondary screen was the same as the primary screen except that: four replicates per strain were tested; and one set of cultures was supplemented with 1 mM sodium hexanoate, while the second set of cultures was not supplemented with sodium hexanoate. Olivetol production was normalized to a positive control strain expressing a C. sativa OLS. Table 7 provides results for strains that exhibited average normalized olivetol >1 and/or that produced higher amounts of olivetolic acid than the positive control in samples that were supplemented with sodium hexanoate. FIG. 12A depicts olivetol production in library strains supplemented with sodium hexanoate. FIG. 12B depicts olivetolic acid production in library strains supplemented with sodium hexanoate. Table 8 provides results for strains that exhibited average normalized olivetol >1 and/or that produced higher amounts of olivetolic acid than the positive control in samples that were not supplemented with sodium hexanoate. Table 6 provides sequence information for strains described in Tables 7 and 8. The Average Normalized Olivetol for each strain was calculated by taking the mean, across the replicates of that strain, of the following ratio: olivetol relative to absorbance measured at 600 nm, divided by the average olivetol produced by the C. sativa OLS positive control in the same plate.
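The Average Normalized Olivetol calculation described above can be sketched in Python as follows. This is one reading of the definition, not a definitive implementation: the function name is hypothetical, the control mean for each plate is assumed to be expressed in the same absorbance-normalized units as the numerator, and the numbers are illustrative only and are not data from this disclosure.

```python
from statistics import mean


def average_normalized_olivetol(replicates, control_mean_by_plate):
    """Mean, over replicates of one strain, of (olivetol / A600) / plate control mean.

    replicates: iterable of (plate_id, olivetol, a600) tuples for one strain.
    control_mean_by_plate: plate_id -> mean value for the C. sativa OLS positive
        control on that plate (assumed to be in the same units as olivetol / A600).
    """
    ratios = []
    for plate_id, olivetol, a600 in replicates:
        per_absorbance = olivetol / a600
        ratios.append(per_absorbance / control_mean_by_plate[plate_id])
    return mean(ratios)


# Illustrative values only.
example_replicates = [("P1", 12.0, 6.0), ("P1", 9.0, 6.0), ("P2", 8.0, 4.0), ("P2", 10.0, 4.0)]
control_means = {"P1": 1.0, "P2": 2.0}
print(average_normalized_olivetol(example_replicates, control_means))
```

Each replicate is normalized against the control on its own plate before averaging, so plate-to-plate variation in the control does not carry over into the strain average.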

One of the top olivetol-producing strains identified in this Example was strain t527346, comprising an OLS enzyme from Cymbidium hybrid cultivar (Accession ID: A0A088G5Z5; SEQ ID NO: 7; Tables 6-8). As discussed above, this candidate OLS was also identified in the screening conducted in Example 2. Protein alignments conducted with BLASTP using default parameters identified two other OLS candidates that shared at least 90% identity with the OLS enzyme from Cymbidium hybrid cultivar (Accession ID: A0A088G5Z5; SEQ ID NO: 7). The corresponding strains were strain t598916 (expressing an OLS corresponding to SEQ ID NO: 145) and strain t599231 (expressing an OLS corresponding to SEQ ID NO: 15) (Tables 6-8), whose OLS enzymes were 93.07% and 91.79% identical to SEQ ID NO: 7, respectively.

Another notable olivetol-producing strain identified in this Example was strain t599285, comprising an OLS enzyme from Araucaria cunninghamii (Accession A0A0D6QTX3; SEQ ID NO: 17). As discussed above, this candidate OLS was also identified in the screening conducted in Example 2.

Consistent with the identification in Example 2 of a candidate OLS from Corchorus olitorius (Jute) as a potentially bifunctional OLS, strain t598084, comprising a full-length version of the Corchorus olitorius (Jute) OLS (corresponding to UniProt Accession No. A0A1R3HSU5; SEQ ID NO: 6), was independently identified in this Example as a candidate OLS that produced more olivetolic acid than a Cannabis OLS positive control, and produced more olivetolic acid than olivetol, based on average olivetol and olivetolic acid produced (Tables 6-8).

Example 4: Generation and Screening of C. sativa OLS Protein Engineering Library in S. cerevisiae

To identify C. sativa OLS (CsOLS) enzyme variants with improved olivetol production, a library of approximately 1300 members was designed. The library included CsOLS enzymes containing single or multiple amino acid substitutions or deletions. Nucleotide sequences were codon-optimized for expression in S. cerevisiae and synthesized in the replicative yeast expression vector shown in FIGS. 5A-5B. Each candidate enzyme expression construct was transformed into an auxotrophic S. cerevisiae CEN.PK GAL80 knockout strain. Transformants were selected based on ability to grow on media lacking uracil. Strain t346317, carrying GFP, was included in the library as a negative control.

The library of candidate CsOLS enzyme variants was assayed for activity in a high-throughput primary screen using the OLS assay described in Example 2 (FIG. 13). LC-MS analysis revealed that approximately 95% of library members produced measurable amounts of olivetol.

The top olivetol and/or olivetolic acid-producing strains from the primary screen were carried over to a secondary screen to verify the results. The experimental protocol for the secondary screen was the same as the primary screen, except that four replicates per strain were tested and a standard curve for olivetol and olivetolic acid was generated so that the amount of olivetol and olivetolic acid could be quantified via LC-MS (FIG. 13).

Multiple OLS variants were identified that were capable of producing olivetol (Table 9). In order to investigate where the point mutations in the OLS variants were located relative to the OLS enzyme structure, a 3D model of the wildtype C. sativa OLS protein (corresponding to SEQ ID NO: 5) was generated using Rosetta protein modeling software. The active site of the C. sativa OLS enzyme was identified based on the catalytic triad of residues described in Taura et al. (2009) FEBS Letters for OLS enzymes, consisting of residues H297, C157, and N330 in the C. sativa OLS enzyme. The active site was also defined to include a docked molecule of hexanoyl-CoA (OLS substrate). Residues were considered to be within the active site if they were within about 12 angstroms of any of the residues within the catalytic triad of the OLS enzyme and/or within about 12 angstroms of a docked substrate within the OLS enzyme.
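The distance criterion described above can be sketched in Python as follows. The sketch assumes that a residue counts as being within the active site when any of its atoms lies within the cutoff of any reference atom (catalytic triad or docked substrate); the atom-level criterion actually used is not specified above. The function name, residue labels, and coordinates are hypothetical.

```python
from math import dist


def residues_near(residue_atoms, reference_atoms, cutoff=12.0):
    """Return sorted residue labels having any atom within `cutoff` angstroms
    of any reference atom.

    residue_atoms: {residue_label: [(x, y, z), ...]}
    reference_atoms: [(x, y, z), ...] for the catalytic triad and docked substrate
    """
    hits = set()
    for label, atoms in residue_atoms.items():
        if any(dist(atom, ref) <= cutoff for atom in atoms for ref in reference_atoms):
            hits.add(label)
    return sorted(hits)


# Toy coordinates (hypothetical), with a single reference atom at the origin.
toy_reference = [(0.0, 0.0, 0.0)]
toy_residues = {
    "res_a": [(3.0, 4.0, 0.0)],    # 5 angstroms from the reference atom
    "res_b": [(0.0, 0.0, 10.0)],   # 10 angstroms
    "res_c": [(20.0, 0.0, 0.0)],   # 20 angstroms
}
print(residues_near(toy_residues, toy_reference, cutoff=12.0))
print(residues_near(toy_residues, toy_reference, cutoff=8.0))
```

Lowering the cutoff from 12 to 8 angstroms narrows the selection to residues closer to the reference atoms, which is the relationship between the two shells discussed in this Example.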

A subset of OLS point mutations was identified in strains that produced at least 10 mg/L olivetol, in which the mutated residue mapped to within the active site. This group of point mutations included: T17K, I23C, L25R, K51R, D54R, F64Y, V95A, T123C, A125S, Y153G, E196K, L201C, I207L, L241I, T247A, M267K, M267G, I273V, L277M, T296A, V307I, D320A, V324I, S326R, H328Y, S334P, S334A, T335C, R375T (Table 9). FIG. 17 provides a schematic of the 3D structure of the C. sativa OLS protein (corresponding to SEQ ID NO: 5), showing the catalytic triad, the bound hexanoyl-CoA substrate, and the cluster of point mutations identified within the active site.

OLS point mutations from strains that produced at least 10 mg/L olivetol and mapped to within about 8 angstroms of any of the residues within the catalytic triad of the OLS enzyme and/or within about 8 angstroms of a docked substrate within the OLS enzyme included: K51R, D54R, T123C, A125S, L201C, I207L, L241I, T247A, M267K, M267G, I273V, T296A, V307I, V324I, S326R, H328Y, S334P, T335C, and R375T. FIG. 16 provides a schematic of the 3D structure of the C. sativa OLS protein (corresponding to SEQ ID NO: 5), showing the catalytic triad, the bound hexanoyl-CoA substrate, and the cluster of point mutations identified within about 8 angstroms of any of the residues within the catalytic triad of the OLS enzyme and/or within about 8 angstroms of a docked substrate within the OLS enzyme.
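The two lists of point mutations given above (within about 12 angstroms and within about 8 angstroms) can be compared programmatically. The Python sketch below parses the substitution notation (wild-type residue, position, replacement residue) and reports which mutations appear only in the 12-angstrom list. The variable and function names are hypothetical; the mutation codes are copied from the lists above.

```python
import re

WITHIN_12A = (
    "T17K I23C L25R K51R D54R F64Y V95A T123C A125S Y153G E196K L201C I207L L241I "
    "T247A M267K M267G I273V L277M T296A V307I D320A V324I S326R H328Y S334P S334A "
    "T335C R375T"
).split()

WITHIN_8A = (
    "K51R D54R T123C A125S L201C I207L L241I T247A M267K M267G I273V T296A V307I "
    "V324I S326R H328Y S334P T335C R375T"
).split()


def parse_substitution(code: str):
    """Split a code such as 'T335C' into (wild_type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


inner = set(WITHIN_8A)
only_outer_shell = [code for code in WITHIN_12A if code not in inner]
print(only_outer_shell)
print(parse_substitution("T335C"))
```

Every mutation in the 8-angstrom list also appears in the 12-angstrom list, and ten mutations appear only in the 12-angstrom list.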

The point mutation that was found to be associated with the most olivetol production was a T335C mutation in the C. sativa OLS sequence (Table 9). This residue maps to the active site of the OLS enzyme (FIGS. 16-17). In further support of the importance of this residue for olivetol production, at least 5 of the high-producing olivetol candidate OLSs identified in Example 3 contain a C residue at this position (strain IDs t527346 (SEQ ID NO: 7), t598265 (SEQ ID NO: 13), t598301 (SEQ ID NO: 7), t598916 (SEQ ID NO: 145), t598976 (SEQ ID NO: 8), t599231 (SEQ ID NO: 15)). Strains t527346 and t598301 comprise OLS enzymes that have the same amino acid sequence but are encoded by different nucleic acid sequences.

Additional C. sativa OLS variants were identified that did not map within the active site, but which were observed to produce more than approximately 13 mg/L olivetol (Table 9). This group of point mutations included: I284Y, K100L, K116R, I278E, K108D, L348S, K71R, V92G, T128V, K100M, Y135V, P229A, T128A, T128I (Table 9).

Table 13 provides sequence information for strains described in Table 9.

Thus, novel variants of the C. sativa OLS protein that may be useful for olivetol production in recombinant host cells were identified.

Example 5. Generation and Screening of C. sativa OLS (CsOLS), Cymbidium OLS (ChOLS), and Corchorus OLS (CoOLS) Protein Engineering Libraries in S. cerevisiae

An additional OLS protein engineering library was generated that included OLS variants based on three different OLS templates: C. sativa OLS (CsOLS), Cymbidium hybrid cultivar OLS (ChOLS), and Corchorus olitorius OLS (CoOLS), which were among the candidate OLSs identified in the screens described in Examples 2 and 3. As discussed above, ChOLS was identified as being one of the strongest olivetol-producing candidate OLS enzymes, while CoOLS was identified as being a potential bifunctional enzyme possessing both polyketide synthase and polyketide cyclase catalytic functions.

The library included approximately 300 variants of ChOLS and approximately 200 variants of CoOLS. Variants of ChOLS and CoOLS included both single and multiple amino acid substitutions. For ChOLS, some of the variants were designed by taking beneficial mutations discovered from screening of CsOLS variants described in Example 4 and mapping the corresponding mutations onto the ChOLS template. Corresponding positions in ChOLS were identified and mutated.

For CoOLS, some of the variants were designed to investigate whether there were any specific residues that may contribute to conferring or enhancing bifunctionality. The sequence of the bifunctional CoOLS enzyme (SEQ ID NO: 6) was aligned with the sequence of the CsOLS enzyme (SEQ ID NO: 5), which is not bifunctional, and residues that are different between the sequences were considered for mutagenesis in both the CsOLS and CoOLS sequences. The impact of these mutations on bifunctionality was investigated by measuring the ratio of production of olivetolic acid to olivetol. A specific residue that was investigated with respect to bifunctionality was residue W339 in CoOLS, which corresponds to residue S332 in CsOLS.
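Identifying a corresponding position in a second template, as described above for CoOLS and CsOLS, amounts to walking a pairwise alignment and counting non-gap residues in each sequence. A minimal Python sketch follows. The function name and the toy aligned strings are hypothetical and are not the sequences of SEQ ID NO: 5 or SEQ ID NO: 6; the alignment itself is assumed to have been produced by a separate tool.

```python
def map_position(aligned_src: str, aligned_dst: str, src_pos: int):
    """Map a 1-based residue number in the source sequence to the 1-based residue
    number aligned to it in the destination sequence.

    Both inputs are rows of the same pairwise alignment, with '-' marking gaps.
    Returns None when the source residue is aligned to a gap.
    """
    if len(aligned_src) != len(aligned_dst):
        raise ValueError("aligned sequences must have equal length")
    src_index = dst_index = 0
    for src_char, dst_char in zip(aligned_src, aligned_dst):
        if src_char != "-":
            src_index += 1
        if dst_char != "-":
            dst_index += 1
        if src_char != "-" and src_index == src_pos:
            return dst_index if dst_char != "-" else None
    raise IndexError("source position is beyond the end of the sequence")


# Toy alignment (hypothetical sequences).
toy_src = "MK-TAYW"
toy_dst = "MKGTA-S"
print(map_position(toy_src, toy_dst, 3))  # T in the source
print(map_position(toy_src, toy_dst, 6))  # W in the source, aligned to S
```

In the toy alignment, residue 3 of the source maps to residue 4 of the destination because of the inserted residue, and a source residue aligned to a gap has no counterpart.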

Nucleotide sequences of the genes within the library were codon-optimized for expression in S. cerevisiae and synthesized in the replicative yeast expression vector shown in FIGS. 5A-5B. Each candidate OLS expression construct was transformed into an S. cerevisiae CEN.PK strain expressing a heterologous AAE VcsA-Q6N4N8 from R. palustris. The library was screened in a high-throughput primary screen in which the OLS assay was conducted as described in Example 2, except that production cultures were not supplemented with either sodium hexanoate or sodium butyrate. Instead, the strains' natural pools of hexanoyl-CoA and butyryl-CoA were used as substrates. Top olivetol and/or olivetolic acid producing strains were carried over to a secondary screen to verify production of olivetol and/or olivetolic acid. The experimental protocol for the secondary screen was identical to the primary screen, except that four replicates per strain were tested, and olivetol production was assessed both in production cultures supplemented with sodium hexanoate and in production cultures not supplemented with sodium hexanoate.

Strain t527338, expressing a fluorescent protein, was included in the library as a negative control for enzyme activity. Strain t527340, expressing wild-type CsOLS, was included in the library as a positive control. Strain t527346, expressing wild-type ChOLS, was included in the library as a positive control and was used to establish hit ranking for variants designed using ChOLS as a template. Similarly, strain t606797, expressing wild-type CoOLS, was included in the library as a positive control and was used to establish hit ranking for variants designed using CoOLS as a template. Olivetol produced by each variant was normalized to the mean production of its wild-type template (e.g., olivetol produced by a variant of ChOLS was normalized to the mean olivetol titer produced by strain t527346) except that for variants made to the CoOLS template, olivetol was normalized against each of a CsOLS template and a CoOLS template due to inconsistent activity of the CoOLS wild-type control (Tables 10A-B and 11A-B). The Average Normalized Olivetol value for wild-type templates was not necessarily 1.0. For example, for the wild-type C. sativa strain t527340 in Table 10B, the Average Normalized Olivetol value was 1.02159. This was because the mean by which values were normalized was based on library controls that were included on each plate within a screen (e.g., strain t527340). The library further contained additional in-library controls of the same strain (e.g., strain t527340). Those additional in-library controls were not used to calculate the mean. In instances where the average normalized olivetol values for all samples of strain t527340 were calculated, if the in-library controls produced slightly more olivetol than the mean olivetol produced by the library controls that were included on each plate, then the Average Normalized Olivetol value was slightly above 1.0.
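The reason a wild-type template can have an Average Normalized Olivetol value other than 1.0 can be illustrated with a short Python sketch. The numbers are illustrative only and are not data from this disclosure, the variable names are hypothetical, and a single plate is assumed for simplicity.

```python
from statistics import mean

# Per-plate library controls: these wells alone define the normalization mean.
plate_controls = [10.0, 10.0, 10.0, 10.0]
# Additional in-library copies of the same strain: excluded from the normalization mean.
in_library_copies = [10.5, 10.3]

normalization_mean = mean(plate_controls)

# The reported average covers all samples of the strain, including the in-library copies.
all_samples = plate_controls + in_library_copies
average_normalized = mean(value / normalization_mean for value in all_samples)
print(average_normalized)
```

Because the in-library copies in this illustration produced slightly more than the normalization mean, the average over all samples of the strain comes out slightly above 1.0.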

Results from the secondary screen are provided in Tables 10-11. Table 10 provides results for samples that were supplemented with sodium hexanoate, while Table 11 provides results for samples that were not supplemented with sodium hexanoate. In Table 10, strains comprising ChOLS mutants that produced an average normalized olivetol level of at least 0.5 are shown. The performance of multi-mutation ChOLS enzymes is also shown.

For ChOLS, the approach of mapping equivalent variants from the CsOLS sequence led to the identification of multiple variants that exhibited improved olivetol production. These variants included the following point mutations: V71Y, F70M, L385M, E285A, L76I, N151P, E203K, V50N, S34Q, R100P, A219C, K359M, and R100T (Table 11). Several additional variants exhibited improved olivetol production in samples that were supplemented with sodium hexanoate. These variants included the following point mutations: V71Y and F70M (Table 10).

For CoOLS, when the mutation W339S was made to the CoOLS template (strain t607112), the ratio of olivetolic acid to olivetol decreased from approximately 1.5 to approximately 0.157 (based on 517.075 μg/L olivetol and 81.225 μg/L olivetolic acid, as shown in Table 10A). However, olivetol levels reported in Table 10A for strain t607112 were within the standard deviation. Accordingly, while the mutation may have had an impact on bifunctionality, it also appears to have more generally affected overall functionality of the enzyme. The reverse mutation was also tested in CsOLS. For CsOLS, S332W (strain t606899) had a significantly negative impact on the function of the enzyme (Table 10A). Similarly, mutation S339W in ChOLS (strain t607377) had a significantly negative impact on the overall function of the enzyme (Table 10A).
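The ratio reported above for strain t607112 follows directly from the two titers given. A minimal Python check is shown below; the function name is hypothetical, and the titers are the values stated above.

```python
def olivetolic_acid_to_olivetol_ratio(olivetolic_acid: float, olivetol: float) -> float:
    """Ratio used above as a readout of bifunctionality (both titers in the same units)."""
    return olivetolic_acid / olivetol


ratio_t607112 = olivetolic_acid_to_olivetol_ratio(olivetolic_acid=81.225, olivetol=517.075)
print(round(ratio_t607112, 3))  # 0.157
```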

Example 6. Functional Expression of OLS Enzymes in a Prototrophic S. cerevisiae Strain

Examples 2-5 utilized an auxotrophic S. cerevisiae CEN.PK strain as a host chassis for the expression of OLS enzyme candidates from a replicative plasmid. OLS candidate enzymes determined to be active in Examples 2-5 were also assessed in a prototrophic S. cerevisiae CEN.PK strain.

A library of approximately 58 OLS genes under the control of the same genetic regulatory elements shown in FIGS. 5A-5B (GAL1 promoter and CYC1 terminator) was integrated into the genome of a prototrophic S. cerevisiae CEN.PK strain. The parental chassis strain t473139, not expressing a heterologous OLS enzyme, was included as a negative control for enzyme activity. Strain t496084, expressing the CsOLS T335C point-mutant, which was the highest ranking CsOLS point mutant identified in Example 5 based on production of olivetol, was also included. The OLS assay was conducted as described in Example 2 with the following exceptions: glycerol stocks were stamped into YEP+4% glucose; a portion of the resulting cultures was then stamped into production cultures containing YEP+4% glucose+1 mM sodium hexanoate; and three bio-replicates were used instead of two.

Despite differences between auxotrophic and prototrophic strains that may impact production of olivetol, candidate OLS enzymes identified in Examples 2-5 through screening in auxotrophic strains were also found to be effective in production of olivetol in a prototrophic strain (Table 12 and FIG. 15). As shown in Table 12, strain t496073, corresponding to a prototrophic S. cerevisiae strain comprising a chromosomally integrated, codon-optimized nucleotide sequence encoding the OLS candidate from Cymbidium hybrid cultivar (Accession ID: A0A088G5Z5), which was identified in Examples 2 and 3, produced the highest olivetol titer of any library member and significantly more olivetol than the C. sativa control (FIG. 15 and Table 12).

Thus, novel candidate OLS enzymes identified in Examples 2-5 were found to be effective for olivetol production when expressed in prototrophic strains as well as auxotrophic strains.

Example 7. Biosynthesis of Cannabinoids in Engineered S. cerevisiae Host Cells

The activation of an organic acid to its CoA-thioester and the subsequent condensation of this thioester with a number of malonyl-CoA molecules, or other similar polyketide extender units, represent the first two steps in the biosynthesis of all known cannabinoids. To demonstrate the biosynthesis of CBGA (FIG. 1, Formula (8a)), CBDA (FIG. 1, Formula (9a)), THCA (FIG. 1, Formula (10a)), and CBCA (FIG. 1, Formula (11a)), the cannabinoid biosynthetic pathway shown in FIG. 1 is assembled in the genome of a prototrophic S. cerevisiae CEN.PK host cell wherein each enzyme (R1a-R5a) may be present in one or more copies. For example, the S. cerevisiae host cell may express one or more copies of one or more of: an AAE, an OLS, an OAC, a CBGAS, and a TS.

An AAE enzyme expressed heterologously in a host cell may be one or more of the AAE candidates from Y. lipolytica or R. palustris that are shown in Example 1 to be functionally expressed in S. cerevisiae. An OLS enzyme expressed heterologously in a host cell may be an OLS identified and characterized in Examples 2-8, such as a Cymbidium hybrid cultivar OLS (SEQ ID NO: 7) or a Phalaenopsis x Doritaenopsis hybrid cultivar OLS (SEQ ID NO: 15), or an OLS corresponding to SEQ ID NO: 145. The OLS enzyme may also be an engineered OLS such as CsOLS T335C (SEQ ID NO: 207) or an engineered version of any other OLS enzyme described in this disclosure. An OAC enzyme expressed heterologously in a host cell may be a naturally occurring or synthetic OAC that is functionally expressed in S. cerevisiae, or a variant thereof, including an OAC from C. sativa or a variant of an OAC from C. sativa. In instances where a bifunctional OLS, such as Corchorus olitorius OLS (SEQ ID NO: 6), is used, a separate OAC enzyme may be omitted.

A CBGAS enzyme, such as a PT enzyme, expressed heterologously in a host cell may be a naturally occurring or synthetic PT that is functionally expressed in S. cerevisiae, or a variant thereof, including a PT from C. sativa or a variant of a PT from C. sativa. For example, a PT may comprise CsPT4 from C. sativa, or a variant thereof, or NphB from Streptomyces sp. strain CL190, or a variant thereof.

A TS enzyme expressed heterologously in a host cell may be a naturally occurring or synthetic TS that is functionally expressed in S. cerevisiae, or a variant thereof, including a TS from C. sativa or a variant of a TS from C. sativa. The TS enzyme may be a TS that produces one or more of CBDA, THCA, and CBCA as a majority product.
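The enzyme classes described above can be summarized as an ordered pathway (AAE, OLS, OAC, CBGAS, TS). The Python sketch below checks whether a hypothetical set of expressed enzyme classes covers each step, treating the OAC step as optional when the OLS is flagged as bifunctional, as described above for Corchorus olitorius OLS. The names are illustrative only and do not represent a strain design tool of this disclosure.

```python
# Ordered enzyme classes of the pathway, as listed in this Example.
PATHWAY_STEPS = ["AAE", "OLS", "OAC", "CBGAS", "TS"]


def missing_steps(expressed, bifunctional_ols=False):
    """Return pathway steps not covered by the expressed enzyme classes.

    expressed: set of enzyme class names present in the host cell.
    bifunctional_ols: if True, a separate OAC is treated as not required.
    """
    required = [step for step in PATHWAY_STEPS
                if not (step == "OAC" and bifunctional_ols)]
    return [step for step in required if step not in expressed]


host_without_oac = {"AAE", "OLS", "CBGAS", "TS"}
print(missing_steps(host_without_oac))
print(missing_steps(host_without_oac, bifunctional_ols=True))
```

With a monofunctional OLS the host lacking an OAC is reported as missing that step; with a bifunctional OLS no step is reported as missing.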

The cannabinoid fermentation procedure may be similar to the OLS assay described in the Examples above with the following exceptions: the incubation of production cultures may last from, for example, 48-144 hours, and production cultures may be supplemented with, for example, 4% galactose and 1 mM sodium hexanoate approximately every 24 hours. Titers of CBGA, CBDA, THCA, and CBCA may be quantified via LC-MS.

It should be appreciated that sequences disclosed herein may or may not contain signal sequences. The sequences disclosed herein encompass versions with or without signal sequences. It should also be understood that protein sequences disclosed herein may be depicted with or without a start codon (M). Accordingly, in some instances amino acid numbering may correspond to protein sequences containing a start codon, while in other instances, amino acid numbering may correspond to protein sequences that do not contain a start codon. Aspects of the disclosure encompass host cells comprising any of the sequences described herein, including the sequences within Tables 4-6, and 13-16 and fragments thereof.
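Because a sequence may be depicted with or without the initial M, a residue number can differ by one between two depictions of the same protein. The Python sketch below converts between the two numbering conventions. The function name is hypothetical, and no statement is made here about which convention applies to any particular residue number in this disclosure.

```python
def renumber(position: int, source_has_initial_met: bool, target_has_initial_met: bool) -> int:
    """Convert a 1-based residue number between depictions with and without the initial M."""
    offset = int(target_has_initial_met) - int(source_has_initial_met)
    new_position = position + offset
    if new_position < 1:
        raise ValueError("the initial M has no counterpart in a depiction that omits it")
    return new_position


# A residue numbered 335 in a depiction that includes the initial M is residue 334
# in a depiction of the same protein that omits it.
print(renumber(335, source_has_initial_met=True, target_has_initial_met=False))
```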

Additional Tables Associated with the Disclosure

TABLE 4 Sequence Information For Strains Described in Example 1

Strain ID | AAE Sequence Information
t49578 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 70), which encodes a (E. coli) Yarrowia lipolytica protein (SEQ ID NO: 63). The protein sequence of SEQ ID NO: 63 corresponds to the protein sequence provided by UniProt Accession No. Q6CFE4.
t49594 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 71), which encodes a (E. coli) Yarrowia lipolytica protein (SEQ ID NO: 64). The protein sequence of SEQ ID NO: 64 corresponds to the protein sequence provided by UniProt Accession No. Q6C577. This protein was expressed as a fusion protein with an N-terminal MYC tag (SEQ ID NO: 140). SEQ ID NO: 707 corresponds to the fusion protein. SEQ ID NO: 712 is a codon-optimized nucleic acid encoding SEQ ID NO: 707.
t51477 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 72), which encodes a (E. coli) Yarrowia lipolytica protein (SEQ ID NO: 65). The protein sequence of SEQ ID NO: 65 corresponds to the protein sequence provided by UniProt Accession No. Q6C650. This protein was expressed as a fusion protein with an N-terminal MYC tag (SEQ ID NO: 140). SEQ ID NO: 708 corresponds to the fusion protein. SEQ ID NO: 713 is a codon-optimized nucleic acid encoding SEQ ID NO: 708.
t392878 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 75), which encodes a (S. cerevisiae) Yarrowia lipolytica protein (SEQ ID NO: 141). SEQ ID NO: 141 corresponds to residues 1-595 of SEQ ID NO: 68. The protein sequence of SEQ ID NO: 141 corresponds to the protein sequence provided by UniProt Accession No. Q6C577 except that the last three residues (peroxisomal targeting signal 1) were removed.
t392879 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 76), which encodes a (S. cerevisiae) Yarrowia lipolytica protein (SEQ ID NO: 142). SEQ ID NO: 142 corresponds to residues 1-595 of SEQ ID NO: 69. The protein sequence of SEQ ID NO: 142 corresponds to the protein sequence provided by UniProt Accession No. Q6C577 except that the last three residues (peroxisomal targeting signal 1) were removed.
t55127 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 73), which encodes a (E. coli) Rhodopseudomonas palustris protein (SEQ ID NO: 66). The protein sequence of SEQ ID NO: 66 corresponds to the protein sequence provided by UniProt Accession No. Q6N948.
t55128 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 74), which encodes a (E. coli) Rhodopseudomonas palustris protein (SEQ ID NO: 67). The protein sequence of SEQ ID NO: 67 corresponds to the protein sequence provided by UniProt Accession No. Q6N4N8.
t49580 | This strain comprises a codon-optimized nucleic acid (SEQ ID NO: 72), which encodes a (E. coli) Yarrowia lipolytica protein (SEQ ID NO: 65). The protein sequence of SEQ ID NO: 65 corresponds to the protein sequence provided by UniProt Accession No. Q6C650.

TABLE 5 Sequence Information for Strains Described in Example 2 and FIGS. 8-10

Strain ID | OLS Sequence Information
t394087 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1182 of SEQ ID NO: 32 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 32). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 1. The protein sequence of SEQ ID NO: 1 corresponds to the protein sequence provided by UniProt Accession No. A0A2G5F4L7, from Aquilegia coerulea (Rocky mountain columbine)
t394687 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1179 of SEQ ID NO: 33 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 33). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 2. The protein sequence of SEQ ID NO: 2 corresponds to the protein sequence provided by UniProt Accession No. I6VW41, from Vitis pseudoreticulata (Chinese wild grapevine)
t393495 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1185 of SEQ ID NO: 34 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 34). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 3. The protein sequence of SEQ ID NO: 3 corresponds to the protein sequence provided by UniProt Accession No. M4DVZ4, from Brassica rapa subsp. pekinensis (Chinese cabbage) (Brassica pekinensis)
t393563 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1197 of SEQ ID NO: 35 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 35). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 4. The protein sequence of SEQ ID NO: 4 corresponds to the protein sequence provided by UniProt Accession No. Q8VWQ7, from Sorghum bicolor (Sorghum) (Sorghum vulgare)
t339568 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1158 of SEQ ID NO: 36 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 36). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 5. The protein sequence of SEQ ID NO: 5 corresponds to the protein sequence provided by UniProt Accession No. B1Q2B6, from Cannabis sativa
t393974 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1191 of SEQ ID NO: 37 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 37). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 6. The protein sequence of SEQ ID NO: 6 corresponds to the protein sequence provided by UniProt Accession No. A0A1R3HSU5, from Corchorus olitorius
t393991 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1173 of SEQ ID NO: 38 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 38). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 7. The protein sequence of SEQ ID NO: 7 corresponds to the protein sequence provided by UniProt Accession No. A0A088G5Z5, from Cymbidium hybrid cultivar
t394336 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1185 of SEQ ID NO: 39 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 39). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 8. The protein sequence of SEQ ID NO: 8 corresponds to the protein sequence provided by UniProt Accession No. A0A0A6Z8B1, from Paphiopedilum helenae
t394547 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1185 of SEQ ID NO: 40 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 40). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 9. The protein sequence of SEQ ID NO: 9 corresponds to the protein sequence provided by UniProt Accession No. A0A078IM49, from Brassica napus (Rape)
t394457 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1191 of SEQ ID NO: 41 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 41). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 10. The protein sequence of SEQ ID NO: 10 corresponds to the protein sequence provided by UniProt Accession No. A0A140KXU1, from Picea jezoensis
t394521 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1191 of SEQ ID NO: 42 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 42). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 11. The protein sequence of SEQ ID NO: 11 corresponds to the protein sequence provided by UniProt Accession No. P48408, from Pinus strobus (Eastern white pine)
t394790 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1170 of SEQ ID NO: 43 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 43). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 12. The protein sequence of SEQ ID NO: 12 corresponds to the protein sequence provided by UniProt Accession No. I3QQ50, from Arachis hypogaea (Peanut)
t394905 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1296 of SEQ ID NO: 44 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 44). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 13. The protein sequence of SEQ ID NO: 13 corresponds to the protein sequence provided by UniProt Accession No. A0A1S4ATN2, from Nicotiana tabacum (Common tobacco)
t394981 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1197 of SEQ ID NO: 45 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 45). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 14. The protein sequence of SEQ ID NO: 14 corresponds to the protein sequence provided by UniProt Accession No. K3Y7T4, from Setaria italica (Foxtail millet) (Panicum italicum)
t395011 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1173 of SEQ ID NO: 46 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 46). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 15. The protein sequence of SEQ ID NO: 15 corresponds to the protein sequence provided by UniProt Accession No. Q6WJD6, from Phalaenopsis x Doritaenopsis hybrid cultivar
t394797 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1170 of SEQ ID NO: 47 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 47). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 16. The protein sequence of SEQ ID NO: 16 corresponds to the protein sequence provided by UniProt Accession No. K7XD27, from Arachis hypogaea (Peanut)
t395094 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1179 of SEQ ID NO: 48 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 48). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 17. The protein sequence of SEQ ID NO: 17 corresponds to the protein sequence provided by UniProt Accession No. A0A0D6QTX3, from Araucaria cunninghamii (Hoop pine) (Moreton Bay pine)
t395103 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1182 of SEQ ID NO: 49 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 49). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 18. The protein sequence of SEQ ID NO: 18 corresponds to the protein sequence provided by UniProt Accession No. V7AZ15, from Phaseolus vulgaris (common bean)
t393835 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1179 of SEQ ID NO: 50 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 50). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 19. The protein sequence of SEQ ID NO: 19 corresponds to the protein sequence provided by UniProt Accession No. I6S977, from Vitis quinquangularis
t394115 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1188 of SEQ ID NO: 51 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 51). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 20. The protein sequence of SEQ ID NO: 20 corresponds to the protein sequence provided by UniProt Accession No. Q9FR69, from Cardamine penzesii
t394091 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1170 of SEQ ID NO: 52 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 52). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 21. The protein sequence of SEQ ID NO: 21 corresponds to the protein sequence provided by UniProt Accession No. G7IQL2, from Medicago truncatula (Barrel medic) (Medicago tribuloides)
t394037 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1179 of SEQ ID NO: 53 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 53). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 22. The protein sequence of SEQ ID NO: 22 corresponds to the protein sequence provided by UniProt Accession No. I6W888, from Vitis pseudoreticulata (Chinese wild grapevine)
t394279 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1188 of SEQ ID NO: 54 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 54). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 23. The protein sequence of SEQ ID NO: 23 corresponds to the protein sequence provided by UniProt Accession No. P13114, from Arabidopsis thaliana (Mouse-ear cress)
t394043 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1344 of SEQ ID NO: 55 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 55). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 24. The protein sequence of SEQ ID NO: 24 corresponds to the protein sequence provided by UniProt Accession No. A0A251SHA8, from Helianthus annuus (Common sunflower)
t394404 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1170 of SEQ ID NO: 56 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 56). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 25. The protein sequence of SEQ ID NO: 25 corresponds to the protein sequence provided by UniProt Accession No. X5I326, from Vaccinium ashei
t394436 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1197 of SEQ ID NO: 57 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 57). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 26. The protein sequence of SEQ ID NO: 26 corresponds to the protein sequence provided by UniProt Accession No. A0A164ZDA1, from Daucus carota subsp. sativus
t393720 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1212 of SEQ ID NO: 58 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 58). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 27. The protein sequence of SEQ ID NO: 27 corresponds to the protein sequence provided by UniProt Accession No. Q58VP7, from Aloe arborescens (Kidachi aloe)
t394911 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1203 of SEQ ID NO: 59 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 59). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 28. The protein sequence of SEQ ID NO: 28 corresponds to the protein sequence provided by UniProt Accession No. A0A2K3P0B5, from Trifolium pratense (Red clover)
t395023 | This strain comprises a codon-optimized nucleic acid that corresponds to nucleotides 3-1200 of SEQ ID NO: 60 (due to a truncation of nucleotides 1-2 of SEQ ID NO: 60). Translation of this sequence is expected to produce a truncated version of a protein corresponding to SEQ ID NO: 29. The protein sequence of SEQ ID NO: 29 corresponds to the protein sequence provided by UniProt Accession No.
Q8GZP4, from Hydrangea macrophylla(Bigleaf hydrangea) (Viburnum macrophyllum) t339579 This straincomprises a codon-optimized nucleic acid that corresponds to nucleotides3-1158 of SEQ ID NO: 61 (due to a truncation of nucleotides 1-2 of SEQID NO: 61). Translation of this sequence is expected to produce atruncated version of a protein corresponding to SEQ ID NO. 30. Theprotein sequence of SEQ ID NO: 30 corresponds to the protein sequenceprovided by UniProt Accession No. B1Q2B6, from C. sativa t339582 Thisstrain comprises a codon-optimized nucleic acid that corresponds tonucleotides 3-1158 of SEQ ID NO: 62 (due to a truncation of nucleotides1-2 of SEQ ID NO: 62). Translation of this sequence is expected toproduce a truncated version of a protein corresponding to SEQ ID NO. 31.The protein sequence of SEQ ID NO: 31 corresponds to the proteinsequence provided by UniProt Accession No. B1Q2B6, from C. sativat394396 This strain comprises a codon-optimized nucleic acid thatcorresponds to nucleotides 3-1158 of SEQ ID NO: 93 (due to a truncationof nucleotides 1-2 of SEQ ID NO: 93). Translation of this sequence isexpected to produce a truncated version of a protein corresponding toSEQ ID NO. 77. The protein sequence of SEQ ID NO: 77 corresponds to theprotein sequence provided by UniProt Accession No. B1Q2B6, from C.sativa t339546 This strain comprises a codon-optimized nucleic acid thatcorresponds to nucleotides 3-1158 of SEQ ID NO: 94 (due to a truncationof nucleotides 1-2 of SEQ ID NO: 94). Translation of this sequence isexpected to produce a truncated version of a protein corresponding toSEQ ID NO: 78. The protein sequence of SEQ ID NO: 78 corresponds to theprotein sequence provided by UniProt Accession No. B1Q2B6, from C.sativa. t339549 This strain comprises a codon-optimized nucleic acidthat corresponds to nucleotides 3-1158 of SEQ ID NO: 95 (due to atruncation of nucleotides 1-2 of SEQ ID NO: 95). 
Translation of thissequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 79. The protein sequence of SEQ ID NO: 79corresponds to the protein sequence provided by UniProt Accession No.B1Q2B6, from C. sativa t393360 This strain comprises a codon-optimizednucleic acid that corresponds to nucleotides 3-1158 of SEQ ID NO: 96(due to a truncation of nucleotides 1-2 of SEQ ID NO: 96). Translationof this sequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 80. The protein sequence of SEQ ID NO: 80corresponds to the protein sequence provided by UniProt Accession No.F1LKH5, from C. sativa t393555 This strain comprises a codon-optimizednucleic acid that corresponds to nucleotides 3-1188 of SEQ ID NO: 97(due to a truncation of nucleotides 1-2 of SEQ ID NO: 97). Translationof this sequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 81. The protein sequence of SEQ ID NO: 81corresponds to the protein sequence provided by UniProt Accession No.Q9SEN0, from Fourraea alpina (Rock-cress) (Arabis pauciflora) t394593This strain comprises a codon-optimized nucleic acid that corresponds tonucleotides 3-1197 of SEQ ID NO: 98 (due to a truncation of nucleotides1-2 of SEQ ID NO: 98). Translation of this sequence is expected toproduce a truncated version of a protein corresponding to SEQ ID NO. 82.The protein sequence of SEQ ID NO: 82 corresponds to the proteinsequence provided by UniProt Accession No. A0A059VFD5, from Punicagranatum (Pomegranate) t394351 This strain comprises a codon-optimizednucleic acid that corresponds to nucleotides 3-1167 of SEQ ID NO: 99(due to a truncation of nucleotides 1-2 of SEQ ID NO: 99). Translationof this sequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 83. 
The protein sequence of SEQ ID NO: 83corresponds to the protein sequence provided by UniProt Accession No.Q1G6T7, from Cardamine apennina t394414 This strain comprises acodon-optimized nucleic acid that corresponds to nucleotides 3-1068 ofSEQ ID NO: 100 (due to a truncation of nucleotides 1-2 of SEQ ID NO:100). Translation of this sequence is expected to produce a truncatedversion of a protein corresponding to SEQ ID NO. 84. The proteinsequence of SEQ ID NO: 84 corresponds to the protein sequence providedby UniProt Accession No. A0A2T5VUN1, from Mycobacterium sp. YR782t393402 This strain comprises a codon-optimized nucleic acid thatcorresponds to nucleotides 3-1173 of SEQ ID NO: 101 (due to a truncationof nucleotides 1-2 of SEQ ID NO: 101). Translation of this sequence isexpected to produce a truncated version of a protein corresponding toSEQ ID NO. 85. The protein sequence of SEQ ID NO: 85 corresponds to theprotein sequence provided by UniProt Accession No. A0A1Q9SCX4, fromKocuria sp. CNJ-770 t394035 This strain comprises a codon-optimizednucleic acid that corresponds to nucleotides 3-675 of SEQ ID NO: 102(due to a truncation of nucleotides 1-2 of SEQ ID NO: 102). Translationof this sequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 86. The protein sequence of SEQ ID NO: 86corresponds to the protein sequence provided by UniProt Accession No.A0A0K8QHJ1 from Arthrobacter sp. Hiyol t394155 This strain comprises acodon-optimized nucleic acid that corresponds to nucleotides 3-1176 ofSEQ ID NO: 103 (due to a truncation of nucleotides 1-2 of SEQ ID NO:103). Translation of this sequence is expected to produce a truncatedversion of a protein corresponding to SEQ ID NO. 87. The proteinsequence of SEQ ID NO: 87 corresponds to the protein sequence providedby UniProt Accession No. Q9XJ57 from Citrus sinensis (Sweet orange)(Citrus aurantium var. 
sinensis) t394137 This strain comprises acodon-optimized nucleic acid that corresponds to nucleotides 3-1173 ofSEQ ID NO: 104 (due to a truncation of nucleotides 1-2 of SEQ ID NO:104). Translation of this sequence is expected to produce a truncatedversion of a protein corresponding to SEQ ID NO. 88. The proteinsequence of SEQ ID NO: 88 corresponds to the protein sequence providedby UniProt Accession No. I6R2S0 from Narcissus tazetta var. chinensist393976 This strain comprises a codon-optimized nucleic acid thatcorresponds to nucleotides 3-1188 of SEQ ID NO: 105 (due to a truncationof nucleotides 1-2 of SEQ ID NO: 105). Translation of this sequence isexpected to produce a truncated version of a protein corresponding toSEQ ID NO. 89. The protein sequence of SEQ ID NO: 89 corresponds to theprotein sequence provided by UniProt Accession No. Q2ENA5 from Abiesalba (Edeltanne) (European silver fir) t394689 This strain comprises acodon-optimized nucleic acid that corresponds to nucleotides 3-1173 ofSEQ ID NO: 106 (due to a truncation of nucleotides 1-2 of SEQ ID NO:106). Translation of this sequence is expected to produce a truncatedversion of a protein corresponding to SEQ ID NO. 90. The proteinsequence of SEQ ID NO: 90 corresponds to the protein sequence providedby UniProt Accession No. A0A022RTH3 from Erythranthe guttata (Yellowmonkey flower) t393400 This strain comprises a codon-optimized nucleicacid that corresponds to nucleotides 3-1053 of SEQ ID NO: 107 (due to atruncation of nucleotides 1-2 of SEQ ID NO: 107). Translation of thissequence is expected to produce a truncated version of a proteincorresponding to SEQ ID NO. 91. The protein sequence of SEQ ID NO: 91corresponds to the protein sequence provided by UniProt Accession No.A0A2T7T652 from Streptomyces scopuliridis RB72 t394693 This straincomprises a codon-optimized nucleic acid that corresponds to nucleotides3-1188 of SEQ ID NO: 108 (due to a truncation of nucleotides 1-2 of SEQID NO: 108). 
Translation of this sequence is expected to produce atruncated version of a protein corresponding to SEQ ID NO: 92. Theprotein sequence of SEQ ID NO: 92 corresponds to the protein sequenceprovided by UniProt Accession No. Q2EFKO, from Abies alba (Edeltanne)(European silver fir)
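The coordinate ranges in the strain descriptions above all follow one pattern: nucleotides 1-2 of the reference sequence are absent and the rest is retained. The arithmetic can be sketched in Python as follows. This is a minimal illustration that assumes the ranges are 1-based and inclusive (an assumption, although it is the usual sequence-listing convention); `retained_span` is a hypothetical helper name, and the example range is the one reported for strain t394981 (nucleotides 3-1197 of SEQ ID NO: 45).

```python
def retained_span(first_nt: int, last_nt: int) -> dict:
    """Summarize a 1-based, inclusive nucleotide range such as 3-1197.

    Assumption: coordinates are 1-based and inclusive, so the range 3-1197
    keeps nucleotide 3, nucleotide 1197, and everything in between.
    """
    if first_nt < 1 or last_nt < first_nt:
        raise ValueError("expected 1 <= first_nt <= last_nt")
    return {
        # Number of nucleotides kept from the reference sequence.
        "retained_length": last_nt - first_nt + 1,
        # Number of nucleotides missing from the 5' end of the reference.
        "removed_from_5_prime_end": first_nt - 1,
    }


# Range reported for strain t394981 (nucleotides 3-1197 of SEQ ID NO: 45).
span = retained_span(3, 1197)
print(span)
```

The sketch only counts positions; it makes no claim about the reading frame or about which residues the truncated protein retains.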

TABLE 6 Sequence Information for Strains Described in Example 3 and Tables 7 and 8

Strain | OLS Nucleotide Sequence (SEQ ID NO) | OLS Protein Sequence (SEQ ID NO)
t527340 | 62 | 5
t527346 | 38 | 7
t599285 | 172 | 17
t598244 | 173 | 143
t598490 | 174 | 144
t598916 | 175 | 145
t598301 | 176 | 7
t598212 | 177 | 146
t598424 | 178 | 147
t598578 | 179 | 148
t598836 | 180 | 149
t597770 | 181 | 150
t597768 | 182 | 151
t599210 | 183 | 152
t597806 | 184 | 153
t598184 | 185 | 154
t598084 | 186 | 6
t598989 | 187 | 155
t598609 | 188 | 156
t598907 | 189 | 157
t598159 | 190 | 158
t598607 | 191 | 159
t598132 | 192 | 160
t598202 | 193 | 161
t598224 | 194 | 162
t598242 | 195 | 163
t598265 | 196 | 13
t598502 | 197 | 164
t598669 | 198 | 165
t598828 | 199 | 166
t598888 | 200 | 167
t598890 | 201 | 168
t598897 | 202 | 169
t598965 | 203 | 170
t598976 | 204 | 8
t599231 | 205 | 15
t599271 | 206 | 171

TABLE 7 Production of Olivetol and Olivetolic Acid in Secondary Screen of Full-Length OLS Library (with sodium hexanoate supplementation)

Strain | Strain type | Average Olivetol [ug/L] | Standard Deviation Olivetol [ug/L] | Average Olivetolic Acid [ug/L] | Standard Deviation Olivetolic Acid [ug/L] | Average Normalized Olivetol (per OD) | Standard Deviation Normalized Olivetol
t527338 | GFP Negative Control | 0 | 0 | 0 | 0 | 0 | 0
t527340 | Cannabis OLS Positive Control | 19907.37 | 3392.375 | 464.2222 | 119.0458 | 1 | 0.17338
t527346 | Library | 47674.72 | 7310.215 | 1126.656 | 275.7896 | 2.577378 | 0.5328
t599285 | Library | 40719.68 | 5413.028 | 1006.4 | 205.3059 | 1.570539 | 0.12684
t598244 | Library | 30067.48 | 2678.216 | 1328.625 | 391.3286 | 2.322177 | 0.24813
t598490 | Library | 29983.43 | 7816.659 | 895.1 | 368.4323 | 1.423788 | 0.286677
t598916 | Library | 23515.18 | 1702.502 | 680.575 | 66.45968 | 0.938026 | 0.042088
t598301 | Library | 21070.73 | 12453.97 | 512.175 | 274.8916 | 0.881438 | 0.570619
t598212 | Library | 19864.23 | 6981.826 | 582.3 | 129.3529 | 1.315837 | 0.25006
t598424 | Library | 18263.73 | 3738.191 | 661.325 | 148.2432 | 0.782173 | 0.157387
t598578 | Library | 18167.93 | 1534.837 | 614.4 | 62.65115 | 0.733176 | 0.065004
t598836 | Library | 17825.58 | 3298.654 | 614.75 | 139.553 | 0.67585 | 0.216607
t597770 | Library | 16611.23 | 2805.423 | 565.575 | 87.49584 | 1.203593 | 0.740925
t597768 | Library | 16140.08 | 2088.271 | 469.65 | 58.01394 | 0.72298 | 0.123509
t599210 | Library | 3019.65 | 187.1009 | 4913.925 | 344.6286 | 0.0939 | 0.012992
t597806 | Library | 1452.425 | 194.0261 | 872.65 | 58.60117 | 0.096846 | 0.040743
t598184 | Library | 466.15 | 76.41424 | 6016.625 | 5727.47 | 0.033016 | 0.009583
t598084 | Library | 298.6 | 38.96913 | 711.85 | 86.71242 | 0.012889 | 0.003159
t598989 | Library | 192.725 | 3.557504 | 981.225 | 924.6046 | 0.008438 | 0.000605
t598609 | Library | 97.025 | 65.64777 | 490.925 | 484.0728 | 0.003307 | 0.002213
t598907 | Library | 97 | 66.37625 | 539.75 | 809.4711 | 0.004664 | 0.0034
t598159 | Library | 73.825 | 50.2624 | 1014.55 | 98.44669 | 0.003517 | 0.002408
t598607 | Library | 57.9 | 67.11512 | 1006.4 | 227.8435 | 0.0017 | 0.001963
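To compare library strains against the Cannabis OLS positive control, the average titers in Table 7 can be expressed as a fold change over the control. The Python sketch below is illustrative only: it divides raw average olivetol titers, which is not the same calculation as the table's "Normalized Olivetol (per OD)" column, because the optical density values needed for that normalization are not listed in this section. The function name `fold_over_control` is hypothetical, and only four rows of Table 7 are copied in.

```python
# Average olivetol titers [ug/L] copied from four rows of Table 7
# (with sodium hexanoate supplementation).
AVERAGE_OLIVETOL_UG_PER_L = {
    "t527340": 19907.37,  # Cannabis OLS positive control
    "t527346": 47674.72,
    "t599285": 40719.68,
    "t598244": 30067.48,
}


def fold_over_control(titers: dict, control: str = "t527340") -> dict:
    """Return each strain's raw titer divided by the control strain's titer.

    Assumption: this is a plain ratio of average titers, not the per-OD
    normalization reported in the table.
    """
    control_titer = titers[control]
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return {strain: titer / control_titer for strain, titer in titers.items()}


folds = fold_over_control(AVERAGE_OLIVETOL_UG_PER_L)
# List strains from highest to lowest fold change.
for strain, fold in sorted(folds.items(), key=lambda item: item[1], reverse=True):
    print(f"{strain}: {fold:.2f}x")
```

Because the ratio ignores culture density, its values differ from the per-OD figures in the table; for t527346, for example, the raw ratio is about 2.39 while the table reports 2.577378.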

TABLE 8 Production of Olivetol and Olivetolic Acid in Secondary Screen of Full-Length OLS Library (without sodium hexanoate supplementation)

Strain | Strain type | Average Olivetol [ug/L] | Standard Deviation Olivetol [ug/L] | Average Olivetolic Acid [ug/L] | Standard Deviation Olivetolic Acid [ug/L] | Average Normalized Olivetol (per OD) | Standard Deviation Normalized Olivetol
t527338 | GFP Negative Control | 0 | 0 | 0 | 0 | 0 | 0
t527340 | Cannabis OLS Positive Control | 233.5102 | 24.98585 | 0.139276 | 0.5909 | 1 | 0.15846
t527346 | Library | 726.4307 | 68.37505 | 0.431449 | 1.830482 | 3.072299 | 0.594866
t597768 | Library | 313.8253 | 41.20191 | 0 | 0 | 1.887013 | 0.355292
t597770 | Library | 600.919 | 84.71989 | 0 | 0 | 2.958839 | 0.883218
t598084 | Library | 41.65336 | 27.90324 | 143.5728 | 17.13177 | 0.203694 | 0.149696
t598132 | Library | 430.7581 | 98.35626 | 0 | 0 | 2.045727 | 0.229508
t598202 | Library | 629.8948 | 43.73374 | 0 | 0 | 2.319367 | 0.145214
t598212 | Library | 439.2199 | 23.32112 | 0 | 0 | 2.133233 | 0.157933
t598224 | Library | 535.1348 | 45.55404 | 4.682865 | 1.486543 | 2.001014 | 0.151286
t598242 | Library | 444.4074 | 36.07498 | 0 | 0 | 2.509074 | 0.247446
t598244 | Library | 780.646 | 10.19476 | 21.86307 | 21.3525 | 4.250859 | 0.503227
t598265 | Library | 399.8005 | 24.89199 | 0 | 0 | 2.005431 | 0.247479
t598301 | Library | 523.4047 | 10.23296 | 0 | 0 | 2.980013 | 0.751203
t598424 | Library | 1039.723 | 20.19861 | 22.23645 | 1.606142 | 6.95869 | 1.19209
t598490 | Library | 1042.654 | 162.3917 | 12.97602 | 6.917579 | 5.733983 | 1.383711
t598502 | Library | 649.3324 | 105.1879 | 0 | 0 | 3.49989 | 0.626512
t598578 | Library | 666.2518 | 167.478 | 0 | 0 | 3.35228 | 0.956232
t598669 | Library | 555.1264 | 124.4665 | 0 | 0 | 3.070987 | 0.828156
t598828 | Library | 362.7059 | 53.2614 | 0 | 0 | 1.702596 | 0.290853
t598836 | Library | 544.9485 | 38.11311 | 0 | 0 | 2.522761 | 0.438238
t598888 | Library | 0 | 0 | 1.036494 | 2.072988 | 0 | 0
t598890 | Library | 216.4236 | 177.1733 | 0 | 0 | 1.067688 | 0.958823
t598897 | Library | 303.5841 | 49.89216 | 0 | 0 | 1.9597 | 0.712481
t598916 | Library | 528.2121 | 40.26965 | 3.166338 | 6.332675 | 2.217625 | 0.308839
t598965 | Library | 259.0065 | 16.53317 | 0 | 0 | 1.82891 | 0.368218
t598976 | Library | 262.3653 | 15.67957 | 0 | 0 | 1.252384 | 0.144212
t599210 | Library | 137.3035 | 10.28164 | 453.0067 | 13.07732 | 0.545126 | 0.127023
t599231 | Library | 575.119 | 59.46478 | 0 | 0 | 2.438902 | 0.437276
t599271 | Library | 644.8095 | 74.40164 | 0 | 0 | 2.902135 | 0.475949
t599285 | Library | 670.6119 | 86.9222 | 0 | 0 | 2.154638 | 0.264646

TABLE 9 Results of Secondary Screen of C. sativa OLS Protein Engineering Library

Strain | Strain type | Amino Acid mutations from wild-type Cannabis protein (SEQ ID NO: 5) | Average Olivetol [ug/L] | Standard Deviation Olivetol [ug/L] | Average Olivetolic Acid [ug/L] | Standard Deviation Olivetolic Acid [ug/L]
t346317 | GFP Negative Ctrl | | 0 | 0 | 0 | 0
t405417 | Library | T335C | 29155.65 | 1352.507 | 925.5075 | 84.12739
t404953 | Library | S334P | 16467.14 | 2021.617 | 473.085 | 68.7974
t405220 | Library | Y153G | 15190.5 | 1885.253 | 437.9325 | 84.3569
t404192 | Library | I284Y | 14380.39 | 1956.468 | 390.18 | 39.47375
t404323 | Library | K100L | 14246.96 | 544.9192 | 456.55 | 35.50002
t404196 | Library | K116R | 14068.84 | 2527.921 | 380.4225 | 42.26844
t404209 | Library | I278E | 13888.94 | 2872.84 | 439.1325 | 66.1486
t404164 | Library | K108D | 13824.77 | 2873.633 | 292.71 | 197.9307
t404170 | Library | L348S | 13625.61 | 1648.021 | 291.5625 | 195.0828
t404384 | Library | K71R | 13619.49 | 3039.582 | 372.775 | 42.82276
t405397 | Library | V92G | 13537.85 | 363.1012 | 414.8725 | 26.8083
t405164 | Library | T128V | 13374.17 | 1328.433 | 385.2925 | 59.52873
t404191 | Library | K100M | 13326.69 | 1006.193 | 280.65 | 188.8515
t405340 | Library | Y135V | 13234.48 | 1441.185 | 393.945 | 57.71197
t404421 | Library | P229A | 13099.17 | 2790.466 | 280.175 | 190.3239
t404631 | Library | L241I | 13096.36 | 2072.187 | 425.4825 | 86.29982
t405133 | Library | T128A | 13050.31 | 1267.354 | 408.1175 | 30.00599
t405081 | Library | T128I | 12839.46 | 770.8077 | 409.595 | 60.20914
t404898 | Library | S334A | 12549.31 | 2014.497 | 392.02 | 20.58724
t405017 | Library | S326R | 12437.99 | 1793.811 | 291.3725 | 198.041
t405140 | Library | A125S | 12379.56 | 2038.247 | 579.6825 | 87.43007
t404276 | Library | I273V | 12341.81 | 673.6841 | 344.5 | 243.9399
t404405 | Library | K51R | 12305.55 | 2024.022 | 401.0325 | 52.97083
t405079 | Library | H328Y | 11965.56 | 636.541 | 380.795 | 28.50397
t404978 | Library | F64Y | 11905.29 | 408.7996 | 208.145 | 241.0289
t405347 | Library | T17K | 11875.39 | 1484.666 | 353.8775 | 41.08276
t404855 | Library | I207L | 11774.75 | 1252.291 | 380.1475 | 30.28471
t405362 | Library | V324I | 11489.89 | 2109.012 | 350.79 | 78.35325
t404523 | Library | L25R | 11375.9 | 2136.219 | 212.6875 | 248.5159
t404951 | Library | T296A | 11266.41 | 958.6764 | 269.73 | 181.7056
t405308 | Library | D320A | 11140.11 | 1318.646 | 346.7825 | 18.1635
t405201 | Library | V307I | 11054.69 | 3152.4 | 194.21 | 241.925
t404219 | Library | I23C | 11046.19 | 1061.09 | 380.345 | 25.59692
t404673 | Library | M267K | 11004.45 | 2014.531 | 382.22 | 111.9515
t404274 | Library | L277M | 10942.46 | 930.1003 | 360.4375 | 80.2903
t405042 | Library | T123C | 10940.26 | 899.6448 | 314.7975 | 212.1934
t404528 | Library | M267G | 10521.73 | 2186.36 | 260.765 | 189.9513
t405312 | Library | E196K | 10503.49 | 1491.483 | 321.1425 | 43.88772
t404725 | Library | R375T | 10474.14 | 637.7002 | 181.37 | 209.4388
t405303 | Library | T247A | 10412.99 | 1484.527 | 353.6075 | 15.69949
t405395 | Library | V95A | 10397.39 | 1215.177 | 299.155 | 49.56504
t405326 | Library | D54R | 10116.47 | 1327.968 | 316.9625 | 42.33512
t404599 | Library | L201C | 10033.37 | 1349.986 | 368.615 | 82.31389
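The mutation labels in Table 9 (for example, T335C) appear to follow the conventional single-letter substitution shorthand: wild-type residue, position, replacement residue. The following Python sketch shows one way such labels could be parsed. It assumes that convention and assumes positions are counted from residue 1 of the wild-type protein; `Substitution` and `parse_substitution` are hypothetical names introduced for illustration, and labels that do not match the simple pattern are returned as `None` rather than interpreted.

```python
import re
from typing import NamedTuple, Optional

# The 20 standard amino acids in single-letter code.
_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"
_SUBSTITUTION = re.compile(rf"^([{_RESIDUES}])(\d+)([{_RESIDUES}])$")


class Substitution(NamedTuple):
    wild_type: str    # residue in the wild-type protein
    position: int     # assumed 1-based position in the wild-type protein
    replacement: str  # residue in the engineered variant


def parse_substitution(label: str) -> Optional[Substitution]:
    """Parse a label such as 'T335C'; return None if it is not a simple substitution."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        return None
    wild_type, position, replacement = match.groups()
    return Substitution(wild_type, int(position), replacement)


print(parse_substitution("T335C"))
print(parse_substitution("S334P"))
```

Parsing the labels into fields makes it straightforward to group variants by position, for example to see that positions 128 and 334 each appear with more than one replacement in the table.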

TABLE 10A Results of Secondary Screen of Protein Engineering Library Using C. sativa, Corchorus, and Cymbidium Templates (supplemented with sodium hexanoate)

Strain | Wild-type template used | Amino Acid mutations from wild-type | Average Olivetol [ug/L] | Standard Deviation Olivetol [ug/L] | Average Olivetolic Acid [ug/L] | Standard Deviation Olivetolic Acid [ug/L]
t527338 GFP | | | 22.72917 | 79.04841 | 0 | 0
t606794 | Cannabis sativa (Hemp) (Marijuana) | wild-type | 15598.47 | 5507.375 | 337.3906 | 163.6388
t527340 Cannabis OLS | Cannabis sativa | wild-type | 15534.95 | 6243.926 | 359.0063 | 194.9093
t607067 | Cannabis sativa | F367L | 10907.1 | 3397.082 | 310.75 | 114.4806
t607367 | Cannabis sativa | G366A | 8394.975 | 2661.736 | 137.95 | 44.65143
t607391 | Cannabis sativa | P298N | 8381.075 | 1668.07 | 167.15 | 37.8978
t606801 | Cannabis sativa | S334P | 22691 | 2379.928 | 572.025 | 55.64575
t606984 | Cannabis sativa | I248M | 384.975 | 256.6688 | 124.475 | 6.990649
t606899 | Cannabis sativa | S332W | 55.1 | 110.2 | 19.4 | 38.8
t606797 Corchorus OLS | Corchorus olitorius | wild-type | 448.8643 | 190.2768 | 665.7071 | 222.9162
t606807 | Corchorus olitorius | d1-8 Y142C | 8173.95 | 11559.71 | 639.55 | 114.0563
t607179 | Corchorus olitorius | Y301W V302I V303T N305P P308K T309A | 1487.6 | 1289.575 | 528.325 | 51.91142
t607149 | Corchorus olitorius | d1-8 W339S | 925.4 | 1638.079 | 108.825 | 35.84888
t607139 | Corchorus olitorius | d1-8 Y266F | 539.55 | 467.458 | 383.3 | 31.26137
t607112 | Corchorus olitorius | W339S | 517.075 | 737.3064 | 81.225 | 37.55896
t607332 | Corchorus olitorius | d1-8 Y301W V302I V303T N305P P308K T309A | 408.25 | 41.92648 | 393.6 | 66.39794
t607153 | Corchorus olitorius | d1-8 A373G | 334.35 | 19.09633 | 529.5 | 33.9143
t607158 | Corchorus olitorius | A373G | 314.8 | 15.06143 | 423.9 | 57.00158
t607236 | Corchorus olitorius | M255I | 315.325 | 23.9742 | 360.4 | 7.086607
t607141 | Corchorus olitorius | d1-8 Y266F W339S | 168.575 | 337.15 | 69.325 | 8.448422
t607176 | Corchorus olitorius | L374F | 130.45 | 150.9282 | 97.05 | 6.728298
t606930 | Corchorus olitorius | d1-8 M255I | 265.85 | 306.9972 | 735.6 | 53.30647
t607193 | Corchorus olitorius | d1-8 T12Y F39Y Q42R L43A Q47E Q51D Q57K I77L G79E S84C E96D T100E L121K N123K A135V A137M T139G H143Q N146K K151R H152P K156R F158M S174A V182R D183G S184A N231T K232N I241V T253D C260G M287E M353R Q357E S395N | 65.675 | 131.35 | 39.3 | 28.53571
t607006 | Corchorus olitorius | Y266F | 265.975 | 178.2236 | 505.325 | 21.29575
t606993 | Corchorus olitorius | d1-8 N305P | 165.875 | 191.5369 | 427.15 | 17.27937
t606852 | Corchorus olitorius | N305P | 73.375 | 146.75 | 404.3 | 58.06772
t607119 | Corchorus olitorius | d1-8 L374F | 0 | 0 | 111.725 | 18.69962
t607371 | Corchorus olitorius | Y266F W339S | 0 | 0 | 37.25 | 3.421988
t527346 Cymbidium OLS | Cymbidium hybrid cultivar | wild-type | 29779.86 | 10784.18 | 631.1694 | 320.6464
t606952 | Cymbidium hybrid cultivar | V71Y | 40374.05 | 3947.169 | 1177.275 | 107.3831
t607284 | Cymbidium hybrid cultivar | F70M | 18119.03 | 1257.566 | 374.85 | 15.30109
t607262 | Cymbidium hybrid cultivar | L385M | 18869.45 | 2300.831 | 394.425 | 43.38251
t606938 | Cymbidium hybrid cultivar | D88A | 30129.05 | 421.2235 | 773.1 | 17.39483
t607260 | Cymbidium hybrid cultivar | E285A | 15646.4 | 1341.912 | 319.1 | 21.50364
t607159 | Cymbidium hybrid cultivar | L76I | 25322 | 7151.452 | 633.5 | 130.6911
t606946 | Cymbidium hybrid cultivar | N151P | 29738.73 | 5548.193 | 778.9 | 116.4196
t606861 | Cymbidium hybrid cultivar | E203K | 44399.18 | 12437.5 | 1252.525 | 394.7546
t606918 | Cymbidium hybrid cultivar | V50N | 27251.28 | 1711.322 | 732.125 | 88.38052
t607135 | Cymbidium hybrid cultivar | E28P | 24306.43 | 6961.951 | 615.45 | 163.6961
t607286 | Cymbidium hybrid cultivar | S34Q | 13463.55 | 871.8021 | 287.675 | 21.0191
t606942 | Cymbidium hybrid cultivar | R100P | 34937.75 | 2136.806 | 968.7 | 20.64752
t606959 | Cymbidium hybrid cultivar | A219C | 32778.4 | 1462.567 | 812.475 | 54.31368
t607294 | Cymbidium hybrid cultivar | K359M | 13309.98 | 473.943 | 279 | 10.60157
t607282 | Cymbidium hybrid cultivar | R100T | 13723.58 | 1657.273 | 282.825 | 20.36768
t607230 | Cymbidium hybrid cultivar | E116D | 21772.93 | 983.0294 | 469.875 | 34.1937
t606965 | Cymbidium hybrid cultivar | Y142V | 26443.03 | 1993.229 | 918 | 96.78736
t607288 | Cymbidium hybrid cultivar | T289D | 13669.95 | 750.4718 | 290.775 | 34.8278
t607228 | Cymbidium hybrid cultivar | M135I | 21348.6 | 702.6819 | 604.6 | 19.56238
t606909 | Cymbidium hybrid cultivar | W368H | 40713.38 | 2212.329 | 998.95 | 50.99526
t606962 | Cymbidium hybrid cultivar | D229E | 28238.98 | 1771.039 | 765.65 | 43.92285
t607150 | Cymbidium hybrid cultivar | E285K | 25386.95 | 2387.108 | 593.6 | 71.86895
t607361 | Cymbidium hybrid cultivar | E323Q | 20209.88 | 3057.447 | 378.4 | 72.17622
t606932 | Cymbidium hybrid cultivar | S18T | 26673.38 | 4632.723 | 726.825 | 106.151
t606940 | Cymbidium hybrid cultivar | A13S | 29094.85 | 3024.508 | 737.1 | 40.72935
t607269 | Cymbidium hybrid cultivar | A333R | 12947.23 | 883.2885 | 268.075 | 35.74207
t607186 | Cymbidium hybrid cultivar | S180N | 23191.9 | 4783.221 | 594.2 | 135.8793
t607476 | Cymbidium hybrid cultivar | L20F | 17200 | 1998.783 | 350.375 | 25.15451
t607031 | Cymbidium hybrid cultivar | N80H | 25388.63 | 6003.712 | 611.825 | 130.4566
t606916 | Cymbidium hybrid cultivar | R100A | 25282.83 | 1244.898 | 638.4 | 21.58997
t607292 | Cymbidium hybrid cultivar | V331I | 12013.93 | 2802.682 | 241.875 | 58.81765
t606908 | Cymbidium hybrid cultivar | I22M | 32699.68 | 21947.9 | 1120.825 | 82.20671
t607248 | Cymbidium hybrid cultivar | E155C | 20670.43 | 1761.502 | 477.55 | 41.49647
t607023 | Cymbidium hybrid cultivar | V71H | 27924.45 | 1393.419 | 761.2 | 34.79224
t606936 | Cymbidium hybrid cultivar | T111K | 26953.6 | 3174.107 | 689.975 | 120.0151
t607433 | Cymbidium hybrid cultivar | L291V | 18071.78 | 6905.44 | 351.525 | 142.8462
t607600 | Cymbidium hybrid cultivar | R123A | 18038.33 | 2576.146 | 333.575 | 50.84194
t606894 | Cymbidium hybrid cultivar | Y142F | 36097.63 | 2137.695 | 911.05 | 54.27403
t606963 | Cymbidium hybrid cultivar | I147E | 26509.6 | 1811.909 | 657.275 | 14.62882
t607603 | Cymbidium hybrid cultivar | Y142I | 18913.93 | 1120.707 | 596.475 | 92.27106
t607452 | Cymbidium hybrid cultivar | K54E | 16597 | 3291.995 | 329.3 | 60.58185
t607197 | Cymbidium hybrid cultivar | G84C | 19663.7 | 884.8279 | 459.3 | 27.12354
t606996 | Cymbidium hybrid cultivar | T170A | 39067.8 | 2354.185 | 1157.575 | 79.54543
t607043 | Cymbidium hybrid cultivar | N45T | 28747.38 | 934.8057 | 679.875 | 21.52431
t607254 | Cymbidium hybrid cultivar | G262T | 11775.15 | 1040.26 | 246.425 | 18.13861
t607478 | Cymbidium hybrid cultivar | N11R | 15894.55 | 2710.618 | 324.075 | 39.31619
t607132 | Cymbidium hybrid cultivar | D208A | 24881.38 | 1936.584 | 765.75 | 60.52451
t607109 | Cymbidium hybrid cultivar | D61R | 25634.95 | 2078.231 | 613.375 | 100.6135
t607155 | Cymbidium hybrid cultivar | C176D | 22283.45 | 3073.579 | 526.325 | 67.26383
t606956 | Cymbidium hybrid cultivar | L83M | 23728.73 | 2145.982 | 656.075 | 79.72258
t606906 | Cymbidium hybrid cultivar | E28A | 33411.15 | 939.6742 | 850.55 | 34.71894
t607195 | Cymbidium hybrid cultivar | T111R | 17676.68 | 1327.959 | 403.725 | 50.31689
t607449 | Cymbidium hybrid cultivar | E203P | 16032.95 | 2469.127 | 317.75 | 28.30789
t607256 | Cymbidium hybrid cultivar | G373A | 12090.73 | 566.8141 | 281.125 | 10.12073
t607349 | Cymbidium hybrid cultivar | K282D | 18699.3 | 3173.399 | 354.175 | 88.50621
t606960 | Cymbidium hybrid cultivar | N78R | 25400.9 | 667.3651 | 660.875 | 37.90764
t607601 | Cymbidium hybrid cultivar | K121V | 18381.8 | 4015.223 | 314.125 | 78.4602
t607021 | Cymbidium hybrid cultivar | H69Y | 24261.93 | 2128.687 | 571.075 | 61.0977
t606874 | Cymbidium hybrid cultivar | D208S | 28092.05 | 2956.394 | 886.05 | 199.1175
t607320 | Cymbidium hybrid cultivar | T111E | 16500.03 | 4257.055 | 303.275 | 77.26765
t607317 | Cymbidium hybrid cultivar | C82E | 17852.45 | 4825.826 | 325.225 | 76.27321
t607224 | Cymbidium hybrid cultivar | Y142C | 19142.53 | 1455.367 | 540.25 | 54.21098
t606912 | Cymbidium hybrid cultivar | D14P | 31141.75 | 1149.49 | 769.2 | 65.33912
t607602 | Cymbidium hybrid cultivar | R100E | 17901.3 | 2319.379 | 328.5 | 67.41187
t606905 | Cymbidium hybrid cultivar | V388A | 29890.83 | 2522.115 | 747.45 | 83.27435
t607156 | Cymbidium hybrid cultivar | H269S | 21143.2 | 1008.997 | 527.5 | 30.55083
t607474 | Cymbidium hybrid cultivar | G262A | 15068.18 | 1372.284 | 326.075 | 28.24894
t607482 | Cymbidium hybrid cultivar | R24K | 14858.45 | 2175.705 | 313.6 | 58.93963
t606854 | Cymbidium hybrid cultivar | L144I | 35182.43 | 2487.935 | 921.625 | 54.02563
t607032 | Cymbidium hybrid cultivar | K355S | 24649.8 | 1675.56 | 544.4 | 16.12203
t606830 | Cymbidium hybrid cultivar | M135V | 34299.15 | 4732.074 | 977.675 | 108.3528
t606961 | Cymbidium hybrid cultivar | V43M | 25357.95 | 678.9486 | 646.975 | 24.3909
t606868 | Cymbidium hybrid cultivar | L248I | 35144.45 | 1993.999 | 945.425 | 66.1703
t607083 | Cymbidium hybrid cultivar | G84S | 21707.73 | 2407.367 | 537.85 | 68.42819
t606958 | Cymbidium hybrid cultivar | C82N | 22576.8 | 2610.997 | 564.75 | 61.60793
t607273 | Cymbidium hybrid cultivar | V341P | 10381.15 | 857.4199 | 209.325 | 25.9962
t607241 | Cymbidium hybrid cultivar | V388T | 17754.23 | 1010.578 | 444.275 | 22.37668
t606857 | Cymbidium hybrid cultivar | A195V | 36686.03 | 2848.052 | 937.825 | 116.369
t606901 | Cymbidium hybrid cultivar | D208C | 30154 | 1882.794 | 883.325 | 84.4674
t607015 | Cymbidium hybrid cultivar | G84T | 23550.9 | 1835.311 | 559.725 | 25.30183
t607586 | Cymbidium hybrid cultivar | I326R | 13845.4 | 1980.772 | 297.075 | 55.00051
t607122 | Cymbidium hybrid cultivar | F374L | 20425.3 | 2531.711 | 488.175 | 62.87863
t606882 | Cymbidium hybrid cultivar | L222I | 27445.5 | 1536.232 | 692.15 | 54.08453
t607146 | Cymbidium hybrid cultivar | G262T | 18830.75 | 1157.57 | 440.025 | 51.65213
t607585 | Cymbidium hybrid cultivar | L83I | 13348.68 | 1378.214 | 274.425 | 30.01082
t606828 | Cymbidium hybrid cultivar | A107M | 32613.03 | 1865.271 | 860.575 | 52.56104
t607081 | Cymbidium hybrid cultivar | I99G | 20827 | 2377.691 | 521.525 | 35.96761
t606887 | Cymbidium hybrid cultivar | Q274G | 27128.45 | 1010.786 | 675.45 | 25.36697
t606891 | Cymbidium hybrid cultivar | K356Q | 26852.25 | 420.5503 | 655.325 | 41.67072
t607160 | Cymbidium hybrid cultivar | V341A | 19361.23 | 1205.165 | 469.725 | 29.27984
t607194 | Cymbidium hybrid cultivar | F86Y | 17323.08 | 6000.996 | 483.975 | 161.5325
t607377 | Cymbidium hybrid cultivar | S339W | 0 | 0 | 46 | 22.64553
t607265 | Cymbidium hybrid cultivar | E28P F40Y N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111Q R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 5858.45 | 955.8912 | 165.625 | 31.40938
t607000 | Cymbidium hybrid cultivar | G373A F374L | 16989.08 | 11567.45 | 700.525 | 192.4804
t607245 | Cymbidium hybrid cultivar | E28P G84C N89P T111R N151P C176D Q206L S237F A252E G262T G281E | 8507.025 | 5710.02 | 210.4 | 141.3894
t607318 | Cymbidium hybrid cultivar | E28P F40Y N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 6864.125 | 789.6743 | 243.25 | 102.741
t607435 | Cymbidium hybrid cultivar | E28P N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 4888.425 | 3390.097 | 163.75 | 117.0815
t607337 | Cymbidium hybrid cultivar | E28P N51E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111R R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 5250.525 | 764.6064 | 135.05 | 26.46186
t607316 | Cymbidium hybrid cultivar | W301Y P303V P305N R308P A309T | 5549.375 | 462.0676 | 169 | 15.9066
t607124 | Cymbidium hybrid cultivar | I255M A266Y W301Y P303V P305N R308P A309T S339W V341P G373A F374L | 0 | 0 | 0 | 0
t607280 | Cymbidium hybrid cultivar | A266Y P305N S339W V341P G373A F374L | 0 | 0 | 0 | 0
t607290 | Cymbidium hybrid cultivar | A266Y S339W | 0 | 0 | 0 | 0
t607381 | Cymbidium hybrid cultivar | A266Y P305N S339W | 0 | 0 | 0 | 0

TABLE 10B Results of Secondary Screen of Protein Engineering Library Using C. sativa, Corchorus, and Cymbidium Templates (supplemented with sodium hexanoate)

Strain | Wild-type template used | Amino Acid mutations from wild-type | Average Normalized Olivetol (per OD) | Standard Deviation Normalized Olivetol | Average Olivetol Normalized to t606797_Corchorus OLS (per OD) | Standard Deviation Olivetol Normalized to t606797_Corchorus OLS
t527338 GFP | | | 0.001903 | 0.006512 | |
t606794 | Cannabis sativa (Hemp) (Marijuana) | wild-type | 1.166072 | 0.265739 | |
t527340 Cannabis OLS | Cannabis sativa | wild-type | 1.02159 | 0.125216 | |
t607067 | Cannabis sativa | F367L | 0.761216 | 0.189784 | |
t607367 | Cannabis sativa | G366A | 0.745615 | 0.241737 | |
t607391 | Cannabis sativa | P298N | 0.708056 | 0.110021 | |
t606801 | Cannabis sativa | S334P | 0.674894 | 0.108203 | |
t606984 | Cannabis sativa | I248M | 0.00995 | 0.006837 | |
t606899 | Cannabis sativa | S332W | 0.001802 | 0.003605 | |
t606797 Corchorus OLS | Corchorus olitorius | wild-type | 0.027033 | 0.012052 | 1 | 0.520764
t606807 | Corchorus olitorius | d1-8 Y142C | 0.275531 | 0.38966 | 29.9306 | 42.32826
t607179 | Corchorus olitorius | Y301W V302I V303T N305P P308K T309A | 0.103413 | 0.091496 | 2.842431 | 2.514883
t607149 | Corchorus olitorius | d1-8 W339S | 0.075566 | 0.133268 | 2.077017 | 3.663029
t607139 | Corchorus olitorius | d1-8 Y266F | 0.038008 | 0.033998 | 1.044701 | 0.934465
t607112 | Corchorus olitorius | W339S | 0.033504 | 0.050586 | 1.29632 | 1.957247
t607332 | Corchorus olitorius | d1-8 Y301W V302I V303T N305P P308K T309A | 0.027999 | 0.00441 | 0.919863 | 0.144887
t607153 | Corchorus olitorius | d1-8 A373G | 0.025872 | 0.001554 | 0.711133 | 0.042715
t607158 | Corchorus olitorius | A373G | 0.0221 | 0.001298 | 0.607436 | 0.035688
t607236 | Corchorus olitorius | M255I | 0.019521 | 0.001797 | 0.701647 | 0.064597
t607141 | Corchorus olitorius | d1-8 Y266F W339S | 0.012974 | 0.025949 | 0.356615 | 0.71323
t607176 | Corchorus olitorius | L374F | 0.010369 | 0.012076 | 0.285013 | 0.33193
t606930 | Corchorus olitorius | d1-8 M255I | 0.009039 | 0.010752 | 0.948726 | 1.128513
t607193 | Corchorus olitorius | d1-8 T12Y F39Y Q42R L43A Q47E Q51D Q57K I77L G79E S84C E96D T100E L121K N123K A135V A137M T139G H143Q N146K K151R H152P K156R F158M S174A V182R D183G S184A N231T K232N I241V T253D C260G M287E M353R Q357E S395N | 0.005613 | 0.011226 | 0.154285 | 0.30857
t607006 | Corchorus olitorius | Y266F | 0.004498 | 0.003084 | 0.58364 | 0.400247
t606993 | Corchorus olitorius | d1-8 N305P | 0.003933 | 0.004545 | 0.510385 | 0.58981
t606852 | Corchorus olitorius | N305P | 0.001714 | 0.003429 | 0.186231 | 0.372463
t607119 | Corchorus olitorius | d1-8 L374F | 0 | 0 | 0 | 0
t607371 | Corchorus olitorius | Y266F W339S | 0 | 0 | 0 | 0
t527346 Cymbidium OLS | Cymbidium hybrid cultivar | wild-type | 0.90183 | 0.165441 | |
t606952 | Cymbidium hybrid cultivar | V71Y | 0.967928 | 0.215533 | |
t607284 | Cymbidium hybrid cultivar | F70M | 0.949145 | 0.075505 | |
t607262 | Cymbidium hybrid cultivar | L385M | 0.890722 | 0.082131 | |
t606938 | Cymbidium hybrid cultivar | D88A | 0.871 | 0.251207 | |
t607260 | Cymbidium hybrid cultivar | E285A | 0.806964 | 0.103286 | |
t607159 | Cymbidium hybrid cultivar | L76I | 0.760304 | 0.251978 | |
t606946 | Cymbidium hybrid cultivar | N151P | 0.754797 | 0.320899 | |
t606861 | Cymbidium hybrid cultivar | E203K | 0.745482 | 0.268545 | |
t606918 | Cymbidium hybrid cultivar | V50N | 0.734491 | 0.069332 | |
t607135 | Cymbidium hybrid cultivar | E28P | 0.731258 | 0.315887 | |
t607286 | Cymbidium hybrid cultivar | S34Q | 0.716576 | 0.071466 | |
t606942 | Cymbidium hybrid cultivar | R100P | 0.712487 | 0.038444 | |
t606959 | Cymbidium hybrid cultivar | A219C | 0.709144 | 0.063379 | |
t607294 | Cymbidium hybrid cultivar | K359M | 0.707812 | 0.023224 | |
t607282 | Cymbidium hybrid cultivar | R100T | 0.703105 | 0.073566 | |
t607230 | Cymbidium hybrid cultivar | E116D | 0.694814 | 0.046965 | |
t606965 | Cymbidium hybrid cultivar | Y142V | 0.690425 | 0.129346 | |
t607288 | Cymbidium hybrid cultivar | T289D | 0.68465 | 0.060985 | |
t607228 | Cymbidium hybrid cultivar | M135I | 0.682179 | 0.118232 | |
t606909 | Cymbidium hybrid cultivar | W368H | 0.676488 | 0.062146 | |
t606962 | Cymbidium hybrid cultivar | D229E | 0.675087 | 0.074269 | |
t607150 | Cymbidium hybrid cultivar | E285K | 0.664425 | 0.076686 | |
t607361 | Cymbidium hybrid cultivar | E323Q | 0.663279 | 0.09616 | |
t606932 | Cymbidium hybrid cultivar | S18T | 0.660218 | 0.135435 | |
t606940 | Cymbidium hybrid cultivar | A13S | 0.657782 | 0.010521 | |
t607269 | Cymbidium hybrid cultivar | A333R | 0.652497 | 0.051357 | |
t607186 | Cymbidium hybrid cultivar | S180N | 0.649029 | 0.165224 | |
t607476 | Cymbidium hybrid cultivar | L20F | 0.647873 | 0.057907 | |
t607031 | Cymbidium hybrid cultivar | N80H | 0.629308 | 0.25658 | |
t606916 | Cymbidium hybrid cultivar | R100A | 0.628769 | 0.063422 | |
t607292 | Cymbidium hybrid cultivar | V331I | 0.628464 | 0.157777 | |
t606908 | Cymbidium hybrid cultivar | I22M | 0.624153 | 0.422076 | |
t607248 | Cymbidium hybrid cultivar | E155C | 0.622639 | 0.035079 | |
t607023 | Cymbidium hybrid cultivar | V71H | 0.620539 | 0.024017 | |
t606936 | Cymbidium hybrid cultivar | T111K | 0.620218 | 0.070587 | |
t607433 | Cymbidium hybrid cultivar | L291V | 0.61928 | 0.21641 | |
t607600 | Cymbidium hybrid cultivar | R123A | 0.617083 | 0.084619 | |
t606894 | Cymbidium hybrid cultivar | Y142F | 0.617067 | 0.042344 | |
t606963 | Cymbidium hybrid cultivar | I147E | 0.616751 | 0.065014 | |
t607603 | Cymbidium hybrid cultivar | Y142I | 0.615776 | 0.03639 | |
t607452 | Cymbidium hybrid cultivar | K54E | 0.609431 | 0.073608 | |
t607197 | Cymbidium hybrid cultivar | G84C | 0.607758 | 0.04262 | |
t606996 | Cymbidium hybrid cultivar | T170A | 0.606281 | 0.062689 | |
t607043 | Cymbidium hybrid cultivar | N45T | 0.601817 | 0.034499 | |
t607254 | Cymbidium hybrid cultivar | G262T | 0.599911 | 0.07071 | |
t607478 | Cymbidium hybrid cultivar | N11R | 0.593189 | 0.06751 | |
t607132 | Cymbidium hybrid cultivar | D208A | 0.592492 | 0.112514 | |
t607109 | Cymbidium hybrid cultivar | D61R | 0.591253 | 0.021319 | |
t607155 | Cymbidium hybrid cultivar | C176D | 0.589853 | 0.117858 | |
t606956 | Cymbidium hybrid cultivar | L83M | 0.588247 | 0.063594 | |
t606906 | Cymbidium hybrid cultivar | E28A | 0.588171 | 0.068304 | |
t607195 | Cymbidium hybrid cultivar | T111R | 0.587434 | 0.080522 | |
t607449 | Cymbidium hybrid cultivar | E203P | 0.583681 | 0.065069 | |
t607256 | Cymbidium hybrid cultivar | G373A | 0.582906 | 0.069959 | |
t607349 | Cymbidium hybrid cultivar | K282D | 0.581651 | 0.053625 | |
t606960 | Cymbidium hybrid cultivar | N78R | 0.571233 | 0.098482 | |
t607601 | Cymbidium hybrid cultivar | K121V | 0.569034 | 0.179725 | |
t607021 | Cymbidium hybrid cultivar | H69Y | 0.563625 | 0.026338 | |
t606874 | Cymbidium hybrid cultivar | D208S | 0.562002 | 0.072513 | |
t607320 | Cymbidium hybrid cultivar | T111E | 0.561861 | 0.128738 | |
t607317 | Cymbidium hybrid cultivar | C82E | 0.559312 | 0.140085 | |
t607224 | Cymbidium hybrid cultivar | Y142C | 0.557739 | 0.033377 | |
t606912 | Cymbidium hybrid cultivar | D14P | 0.557253 | 0.024918 | |
t607602 | Cymbidium hybrid cultivar | R100E | 0.553566 | 0.029107 | |
t606905 | Cymbidium hybrid cultivar | V388A | 0.552755 | 0.038488 | |
t607156 | Cymbidium hybrid cultivar | H269S | 0.552698 | 0.060754 | |
t607474 | Cymbidium hybrid cultivar | G262A | 0.55226 | 0.080327 | |
t607482 | Cymbidium hybrid cultivar | R24K | 0.551229 | 0.059821 | |
t606854 | Cymbidium hybrid cultivar | L144I | 0.550953 | 0.054635 | |
t607032 | Cymbidium hybrid cultivar | K355S | 0.543176 | 0.043908 | |
t606830 | Cymbidium hybrid cultivar | M135V | 0.542887 | 0.080821 | |
t606961 | Cymbidium hybrid cultivar | V43M | 0.535062 | 0.061363 | |
t606868 | Cymbidium hybrid cultivar | L248I | 0.532705 | 0.006774 | |
t607083 | Cymbidium hybrid cultivar | G84S | 0.529811 | 0.024702 | |
t606958 | Cymbidium hybrid cultivar | C82N | 0.525334 | 0.152553 | |
t607273 | Cymbidium hybrid cultivar | V341P | 0.522317 | 0.046732 | |
t607241 | Cymbidium hybrid cultivar | V388T | 0.519025 | 0.016822 | |
t606857 | Cymbidium hybrid cultivar | A195V | 0.518645 | 0.087321 | |
t606901 | Cymbidium hybrid cultivar | D208C | 0.51686 | 0.063709 | |
t607015 | Cymbidium hybrid cultivar | G84T | 0.515084 | 0.042136 | |
t607586 | Cymbidium hybrid cultivar | I326R | 0.514156 | 0.080245 | |
t607122 | Cymbidium hybrid cultivar | F374L | 0.513354 | 0.048872 | |
t606882 | Cymbidium hybrid cultivar | L222I | 0.510184 | 0.011174 | |
t607146 | Cymbidium hybrid cultivar | G262T | 0.506767 | 0.057007 | |
t607585 | Cymbidium hybrid cultivar | L83I | 0.506322 | 0.049736 | |
t606828 | Cymbidium hybrid cultivar | A107M | 0.505604 | 0.006906 | |
t607081 | Cymbidium hybrid cultivar | I99G | 0.504811 | 0.041631 | |
t606887 | Cymbidium hybrid cultivar | Q274G | 0.50406 | 0.082289 | |
t606891 | Cymbidium hybrid cultivar | K356Q | 0.501544 | 0.043785 | |
t607160 | Cymbidium hybrid cultivar | V341A | 0.501469 | 0.056063 | |
t607194 | Cymbidium hybrid cultivar | F86Y | 0.500062 | 0.184291 | |
t607377 | Cymbidium hybrid cultivar | S339W | 0 | 0 | |
t607265 | Cymbidium hybrid cultivar | E28P F40Y N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111Q R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 0.298621 | 0.029421 | |
t607000 Cymbidium G373A F374L 0.279083 0.189883 hybrid
cultivar t607245 Cymbidium E28PG84C 0.246285 0.179183 hybrid cultivar N89P T111R N151P C176D Q206LS237F A252E G262T G281E t607318 Cymbidium E28P F40Y 0.205093 0.035905hybrid cultivar N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97MI99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237FA252E S260G G262A G281E T289K D367E t607435 Cymbidium E28P N51D 0.1819140.136726 hybrid cultivar K54E N80H C82E L83I G84C F86Y D88A N89P N92DF97M I99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233NS237F A252E S260G G262A G281E T289K D367E t607337 Cymbidium E28P N51E0.163476 0.021163 hybrid cultivar N80H C82E L83I G84C F86Y D88A N89PN92D F97M I99R T111R R123K L137M I147L N151P C176D S180N K182P Q206LH233N S237F A252E S260G G262A G281E T289K D367E t607316 Cymbidium W301Y0.161295 0.013439 hybrid cultivar P303V P305N R308P A309T t607124Cymbidium I255M A266Y 0 0 hybrid cultivar W301Y P303V P305N R308P A309TS339W V341P G373A F374L t607280 Cymbidium A266Y P305N 0 0 hybridcultivar S339W V341P G373A F374L t607290 Cymbidium A266Y 0 0 hybridcultivar S339W t607381 Cymbidium A266Y P305N 0 0 hybrid cultivar S339W

TABLE 11A Results of Secondary Screen of Protein Engineering Library Using C. sativa, Corchorus, and Cymbidium Templates (not supplemented with sodium hexanoate)
Strain | Wild-type template used | Amino acid mutations from wild-type | Average Olivetol [ug/L] | Standard Deviation Olivetol [ug/L] | Average Olivetolic Acid [ug/L] | Standard Deviation Olivetolic Acid [ug/L]
t527338 | GFP | | 0.620833 | 3.04145 | 0 | 0
t527340 | Cannabis sativa | wild-type Cannabis OLS | 127.8438 | 14.41426 | 0 | 0
t606801 | Cannabis sativa (Hemp) (Marijuana) | S334P | 220.05 | 20.91355 | 0 | 0
t607067 | Cannabis sativa (Hemp) (Marijuana) | F367L | 160.1 | 15.69777 | 0 | 0
t607367 | Cannabis sativa (Hemp) (Marijuana) | G366A | 142.075 | 3.482695 | 0 | 0
t606794 | Cannabis sativa (Hemp) (Marijuana) | wild-type | 150.8156 | 12.9258 | 0 | 0
t607391 | Cannabis sativa (Hemp) (Marijuana) | P298N | 82.95 | 5.945587 | 0 | 0
t606984 | Cannabis sativa (Hemp) (Marijuana) | I248M | 14.025 | 0.745542 | 0 | 0
t606899 | Cannabis sativa (Hemp) (Marijuana) | S332W | 0 | 0 | 0 | 0
t606797 | Corchorus olitorius | wild-type Corchorus OLS | 39.93571 | 6.066497 | 128.175 | 20.00325
t606807 | Corchorus olitorius | d1-8 Y142C | 170.75 | 207.3944 | 44.3 | 26.72864
t606930 | Corchorus olitorius | d1-8 M255I | 43.925 | 4.811358 | 126.175 | 8.854519
t607179 | Corchorus olitorius | Y301W V302I V303T N305P P308K T309A | 40.725 | 8.259288 | 111.775 | 6.540833
t607332 | Corchorus olitorius | d1-8 Y301W V302I V303T N305P P308K T309A | 43.375 | 4.289814 | 143.675 | 17.99711
t607236 | Corchorus olitorius | M255I | 38.825 | 0.869387 | 101.25 | 4.184495
t607006 | Corchorus olitorius | Y266F | 25.425 | 1.011187 | 84.475 | 1.575595
t606993 | Corchorus olitorius | d1-8 N305P | 24.35 | 1.053565 | 77.375 | 1.817278
t607139 | Corchorus olitorius | d1-8 Y266F | 21.525 | 2.692428 | 62.375 | 2.57083
t607158 | Corchorus olitorius | A373G | 17.725 | 1.705628 | 46.75 | 5.257059
t607153 | Corchorus olitorius | d1-8 A373G | 16.875 | 1.327592 | 51.575 | 2.379601
t606852 | Corchorus olitorius | N305P | 25.675 | 1.613227 | 75.975 | 4.164433
t607112 | Corchorus olitorius | W339S | 0 | 0 | 0 | 0
t607119 | Corchorus olitorius | d1-8 L374F | 0 | 0 | 20.725 | 0.780491
t607141 | Corchorus olitorius | d1-8 Y266F W339S | 0 | 0 | 0 | 0
t607149 | Corchorus olitorius | d1-8 W339S | 0 | 0 | 0 | 0
t607176 | Corchorus olitorius | L374F | 0 | 0 | 17.975 | 0.684957
t607193 | Corchorus olitorius | d1-8 T12Y F39Y Q42R L43A Q47E Q51D Q57K I77L G79E S84C E96D T100E L121K N123K A135V A137M T139G H143Q N146K K151R H152P K156R F158M S174A V182R D183G S184A N231T K232N I241V T253D C260G M287E M353R Q357E S395N | 0 | 0 | 0 | 0
t607371 | Corchorus olitorius | Y266F W339S | 0 | 0 | 0 | 0
t527346 | Cymbidium hybrid cultivar | wild-type Cymbidium OLS | 406.4417 | 80.3967 | 3.658333 | 1.293141
t607221 | Cymbidium hybrid cultivar | N92D | 509.9 | 0 | 6.5 | 0.424264
t607228 | Cymbidium hybrid cultivar | M135I | 524.425 | 26.05281 | 6.825 | 0.861684
t606878 | Cymbidium hybrid cultivar | P303A | 632.95 | 28.30365 | 8.125 | 5.596055
t606986 | Cymbidium hybrid cultivar | E323G | 551.95 | 42.85468 | 8.6 | 1.411855
t606999 | Cymbidium hybrid cultivar | N151P | 508.3 | 23.17873 | 8.4 | 0.365148
t607224 | Cymbidium hybrid cultivar | Y142C | 493.525 | 14.39546 | 7.725 | 0.556028
t606976 | Cymbidium hybrid cultivar | Q274K | 541.475 | 26.72569 | 6.525 | 4.353064
t607241 | Cymbidium hybrid cultivar | V388T | 455.325 | 14.62244 | 5.75 | 0.420317
t607603 | Cymbidium hybrid cultivar | Y142I | 540.975 | 14.56237 | 11.475 | 0.853913
t607222 | Cymbidium hybrid cultivar | N11K | 479.35 | 10.11163 | 6.3 | 0.424264
t607014 | Cymbidium hybrid cultivar | A287M | 511.8 | 26.34755 | 7.975 | 1.388944
t606994 | Cymbidium hybrid cultivar | T289A | 496.475 | 22.44614 | 8.225 | 0.57373
t606982 | Cymbidium hybrid cultivar | V314I | 546.875 | 39.64899 | 8.9 | 1.177568
t606995 | Cymbidium hybrid cultivar | I147L | 480.9 | 23.94006 | 7.225 | 0.7932
t607007 | Cymbidium hybrid cultivar | G281E | 489.6 | 12.27436 | 8.15 | 0.544671
t607008 | Cymbidium hybrid cultivar | I390Q | 491.625 | 18.42107 | 6.575 | 0.125831
t606965 | Cymbidium hybrid cultivar | Y142V | 570.85 | 25.65599 | 10.1 | 0.547723
t607107 | Cymbidium hybrid cultivar | A79E | 431.575 | 22.92108 | 4.575 | 0.877021
t607194 | Cymbidium hybrid cultivar | F86Y | 437.325 | 4.76891 | 4.875 | 0.727438
t606981 | Cymbidium hybrid cultivar | E96R | 512.85 | 23.07993 | 3.825 | 4.439501
t606979 | Cymbidium hybrid cultivar | E32R | 492 | 40.898 | 6.2 | 1.249
t606975 | Cymbidium hybrid cultivar | K182P | 532.825 | 6.97824 | 7.75 | 0.3
t607230 | Cymbidium hybrid cultivar | E116D | 461.35 | 10.56362 | 5.175 | 0.499166
t607004 | Cymbidium hybrid cultivar | L291Y | 487.475 | 6.717825 | 8.25 | 0.619139
t606996 | Cymbidium hybrid cultivar | T170A | 485.075 | 15.50965 | 8 | 0.559762
t607046 | Cymbidium hybrid cultivar | K359R | 527.9 | 18.45842 | 7.775 | 0.556028
t607043 | Cymbidium hybrid cultivar | N45T | 524.225 | 15.59837 | 6.175 | 1.040433
t607021 | Cymbidium hybrid cultivar | H69Y | 530.025 | 26.2045 | 6.95 | 0.635085
t607109 | Cymbidium hybrid cultivar | D61R | 446.85 | 18.05076 | 5.275 | 0.873212
t607036 | Cymbidium hybrid cultivar | K321E | 521.625 | 5.105797 | 8.05 | 0.331662
t606912 | Cymbidium hybrid cultivar | D14P | 538.775 | 35.94546 | 8.975 | 0.910586
t607602 | Cymbidium hybrid cultivar | R100E | 465.225 | 21.39554 | 5.75 | 0.83865
t607361 | Cymbidium hybrid cultivar | E323Q | 453.35 | 6.413787 | 4.775 | 0.411299
t606882 | Cymbidium hybrid cultivar | L222I | 501.55 | 28.07757 | 7.6 | 0.559762
t606962 | Cymbidium hybrid cultivar | D229E | 523.325 | 17.80587 | 7.225 | 0.330404
t607150 | Cymbidium hybrid cultivar | E285K | 418.225 | 12.1393 | 4.225 | 0.330404
t607252 | Cymbidium hybrid cultivar | F40Y | 435.9 | 14.81486 | 5.275 | 0.713559
t607225 | Cymbidium hybrid cultivar | S260G | 418.925 | 13.55221 | 4.175 | 0.170783
t607032 | Cymbidium hybrid cultivar | K355S | 500.8 | 16.68772 | 6.75 | 1.767767
t607248 | Cymbidium hybrid cultivar | E155C | 405.15 | 11.08708 | 3.3 | 1.036018
t607155 | Cymbidium hybrid cultivar | C176D | 423 | 7.037045 | 4.475 | 0.095743
t606958 | Cymbidium hybrid cultivar | C82N | 536.2 | 25.07761 | 7.575 | 0.634429
t607023 | Cymbidium hybrid cultivar | V71H | 511.725 | 10.15755 | 7 | 0.216025
t607027 | Cymbidium hybrid cultivar | T289K | 520.675 | 13.61503 | 8.1 | 0.535413
t606892 | Cymbidium hybrid cultivar | Y142T | 572.6 | 39.21003 | 10.9 | 0.702377
t607035 | Cymbidium hybrid cultivar | Y160G | 513.55 | 23.95308 | 11.825 | 0.93586
t607237 | Cymbidium hybrid cultivar | T289S | 424.05 | 16.8777 | 4.45 | 0.556776
t607189 | Cymbidium hybrid cultivar | N51E | 420 | 11.2116 | 4.55 | 0.619139
t607118 | Cymbidium hybrid cultivar | T289Q | 404.3 | 9.899495 | 4.25 | 0.070711
t607018 | Cymbidium hybrid cultivar | I390N | 493.25 | 17.93813 | 7.025 | 1.297112
t607045 | Cymbidium hybrid cultivar | F30C | 522.175 | 29.45996 | 8 | 0.930949
t606888 | Cymbidium hybrid cultivar | A107L | 521.125 | 20.05731 | 6.85 | 0.404145
t607220 | Cymbidium hybrid cultivar | A13T | 432.725 | 9.603602 | 4.825 | 0.206155
t606830 | Cymbidium hybrid cultivar | M135V | 568.425 | 18.5615 | 7.6 | 0.432049
t606832 | Cymbidium hybrid cultivar | E155Q | 552.975 | 20.81688 | 7.225 | 0.741058
t607601 | Cymbidium hybrid cultivar | K121V | 429.25 | 19.20772 | 5.025 | 0.15
t606857 | Cymbidium hybrid cultivar | A195V | 607.575 | 22.90988 | 8.125 | 0.567891
t607452 | Cymbidium hybrid cultivar | K54E | 436.75 | 12.72962 | 5.325 | 0.623832
t607218 | Cymbidium hybrid cultivar | F97M | 451.85 | 12.7291 | 5 | 0.787401
t607186 | Cymbidium hybrid cultivar | S180N | 435.525 | 20.91274 | 4.8 | 0.535413
t607123 | Cymbidium hybrid cultivar | S237F | 421.475 | 3.484609 | 4.875 | 0.386221
t607286 | Cymbidium hybrid cultivar | S34Q | 452.15 | 17.4135 | 5.35 | 0.412311
t606918 | Cymbidium hybrid cultivar | V50N | 506.275 | 26.08453 | 6.5 | 0.752773
t606916 | Cymbidium hybrid cultivar | R100A | 509.775 | 27.51986 | 7.05 | 0.988264
t606990 | Cymbidium hybrid cultivar | L291W | 457.875 | 21.00831 | 6.925 | 1.001249
t606908 | Cymbidium hybrid cultivar | I22M | 523.45 | 43.08074 | 4.35 | 5.05536
t606963 | Cymbidium hybrid cultivar | I147E | 502.075 | 20.19032 | 6.4 | 0.752773
t607226 | Cymbidium hybrid cultivar | Q115D | 435.7 | 13.37934 | 5.125 | 0.359398
t606961 | Cymbidium hybrid cultivar | V43M | 537.45 | 13.96675 | 6.8 | 0.355903
t607260 | Cymbidium hybrid cultivar | E285A | 454.15 | 7.783101 | 5.2 | 0.244949
t607160 | Cymbidium hybrid cultivar | V341A | 405.65 | 13.91893 | 4.4 | 0.541603
t607156 | Cymbidium hybrid cultivar | H269S | 415.35 | 17.57546 | 3.925 | 0.464579
t607478 | Cymbidium hybrid cultivar | N11R | 448.1 | 9.988994 | 5.05 | 0.772442
t606887 | Cymbidium hybrid cultivar | Q274G | 481.45 | 22.29716 | 5.7 | 3.81663
t606861 | Cymbidium hybrid cultivar | E203K | 594.35 | 17.56407 | 8.25 | 0.82664
t607217 | Cymbidium hybrid cultivar | D367E | 421.1 | 11.77427 | 4.375 | 0.805709
t606894 | Cymbidium hybrid cultivar | Y142F | 512.875 | 39.92705 | 7.9 | 0.752773
t607288 | Cymbidium hybrid cultivar | T289D | 447.875 | 12.13875 | 4.8 | 0.547723
t606952 | Cymbidium hybrid cultivar | V71Y | 514.9 | 31.88239 | 5.225 | 3.545302
t607197 | Cymbidium hybrid cultivar | G84C | 444.2 | 14.33806 | 4.825 | 0.262996
t607146 | Cymbidium hybrid cultivar | G262T | 399.375 | 11.43864 | 3.925 | 0.394757
t607017 | Cymbidium hybrid cultivar | N78E | 517.475 | 8.940311 | 8.9 | 1.75119
t607456 | Cymbidium hybrid cultivar | T111Q | 466.95 | 34.97175 | 5.375 | 0.512348
t606854 | Cymbidium hybrid cultivar | L144I | 581.275 | 44.29946 | 5.925 | 4.19871
t606838 | Cymbidium hybrid cultivar | R123N | 529.65 | 7.364102 | 6.85 | 0.896289
t607213 | Cymbidium hybrid cultivar | T289G | 416.8 | 8.30542 | 4.375 | 0.531507
t606932 | Cymbidium hybrid cultivar | S18T | 495.975 | 39.19034 | 6.075 | 1.189888
t607349 | Cymbidium hybrid cultivar | K282D | 419.1 | 9.083685 | 4.075 | 0.419325
t607585 | Cymbidium hybrid cultivar | L83I | 450.375 | 20.91226 | 5.175 | 0.287228
t606956 | Cymbidium hybrid cultivar | L83M | 475.55 | 9.733276 | 6.3 | 0.535413
t607586 | Cymbidium hybrid cultivar | I326R | 443.525 | 14.63361 | 5.025 | 0.330404
t607025 | Cymbidium hybrid cultivar | I99T | 473.65 | 23.0925 | 6.4 | 0.408248
t607322 | Cymbidium hybrid cultivar | R123K | 425.975 | 8.165935 | 4.6 | 0.616441
t606905 | Cymbidium hybrid cultivar | V388A | 489.825 | 32.71823 | 6.075 | 4.224827
t606835 | Cymbidium hybrid cultivar | Q161F | 562.35 | 16.93999 | 4.05 | 2.763452
t607104 | Cymbidium hybrid cultivar | E323H | 406.55 | 15.15443 | 4.85 | 0.74162
t607135 | Cymbidium hybrid cultivar | E28P | 392.8 | 21.08159 | 4.05 | 0.789515
t606891 | Cymbidium hybrid cultivar | K356Q | 484.225 | 22.33762 | 7.45 | 0.660808
t607031 | Cymbidium hybrid cultivar | N80H | 508.1 | 14.36825 | 7.05 | 0.613732
t607317 | Cymbidium hybrid cultivar | C82E | 472.45 | 13.27466 | 6.4 | 0.909212
t607088 | Cymbidium hybrid cultivar | P303V | 402.225 | 18.42487 | 3.225 | 0.826136
t607262 | Cymbidium hybrid cultivar | L385M | 467.625 | 36.03085 | 4.5 | 1.048809
t606896 | Cymbidium hybrid cultivar | Q115S | 500.675 | 15.10968 | 6.15 | 4.194043
t607269 | Cymbidium hybrid cultivar | A333R | 455.775 | 3.975236 | 5.15 | 0.597216
t607294 | Cymbidium hybrid cultivar | K359M | 441.65 | 4.773887 | 5.125 | 0.25
t607344 | Cymbidium hybrid cultivar | A252E | 421.125 | 19.75135 | 4.675 | 0.670199
t607159 | Cymbidium hybrid cultivar | L76I | 411.4 | 24.21955 | 4.05 | 0.660808
t606890 | Cymbidium hybrid cultivar | K112R | 519.575 | 20.66985 | 8.55 | 0.675771
t607284 | Cymbidium hybrid cultivar | F70M | 422.125 | 9.036731 | 3.925 | 0.419325
t607292 | Cymbidium hybrid cultivar | V331I | 439.375 | 22.82475 | 4.825 | 0.618466
t607476 | Cymbidium hybrid cultivar | L20F | 438.925 | 17.62978 | 4.075 | 0.464579
t606946 | Cymbidium hybrid cultivar | N151P | 484.525 | 10.50599 | 5.875 | 0.699405
t607320 | Cymbidium hybrid cultivar | T111E | 427.525 | 18.14412 | 5.55 | 1.021437
t607083 | Cymbidium hybrid cultivar | G84S | 379.5 | 14.5952 | 4.175 | 1.284199
t607480 | Cymbidium hybrid cultivar | A13V | 437.5 | 12.94656 | 4.85 | 0.759386
t606909 | Cymbidium hybrid cultivar | W368H | 496.825 | 40.68967 | 4.975 | 3.350995
t607449 | Cymbidium hybrid cultivar | E203P | 433.675 | 17.36613 | 4.35 | 0.331662
t606851 | Cymbidium hybrid cultivar | I293V | 514.65 | 43.13626 | 6.35 | 1.369915
t607079 | Cymbidium hybrid cultivar | S34E | 466.325 | 54.18348 | 7.175 | 1.25
t607282 | Cymbidium hybrid cultivar | R100T | 433.175 | 10.15099 | 4.525 | 0.655108
t606967 | Cymbidium hybrid cultivar | M135A | 446.85 | 27.16143 | 2.8 | 3.237283
t606938 | Cymbidium hybrid cultivar | D88A | 483.9 | 1.131371 | 6.05 | 0.070711
t607433 | Cymbidium hybrid cultivar | L291V | 424.875 | 12.5492 | 4.475 | 0.394757
t607357 | Cymbidium hybrid cultivar | T243A | 419.45 | 12.8108 | 4.5 | 0.294392
t607122 | Cymbidium hybrid cultivar | F374L | 381.9 | 18.88686 | 2.9 | 0.678233
t607110 | Cymbidium hybrid cultivar | R317T | 374.025 | 4.807199 | 3.575 | 0.221736
t607015 | Cymbidium hybrid cultivar | G84T | 514.575 | 14.52111 | 7.575 | 1.367175
t607087 | Cymbidium hybrid cultivar | E96A | 393.975 | 20.30195 | 4.375 | 1.004573
t607019 | Cymbidium hybrid cultivar | I99A | 462.175 | 18.38104 | 6.175 | 0.543906
t606839 | Cymbidium hybrid cultivar | D207S | 557.425 | 29.43245 | 7.425 | 0.818026
t606942 | Cymbidium hybrid cultivar | R100P | 461.85 | 7.707464 | 4.05 | 0.212132
t607164 | Cymbidium hybrid cultivar | V327A | 393.05 | 14.87672 | 3.225 | 0.35
t606906 | Cymbidium hybrid cultivar | E28A | 482 | 14.14214 | 6.85 | 0.494975
t606868 | Cymbidium hybrid cultivar | L248I | 555.5 | 18.09586 | 9 | 0.955685
t607089 | Cymbidium hybrid cultivar | L77I | 380.525 | 6.542871 | 4.125 | 1.170114
t606910 | Cymbidium hybrid cultivar | N89P | 471.825 | 23.09998 | 7.025 | 1.611159
t606936 | Cymbidium hybrid cultivar | T111K | 468.95 | 8.328465 | 5.125 | 0.287228
t606856 | Cymbidium hybrid cultivar | V157T | 551.4 | 18.24573 | 8.525 | 0.78475
t607450 | Cymbidium hybrid cultivar | I99R | 406.45 | 13.84786 | 3.5 | 0.637704
t606960 | Cymbidium hybrid cultivar | N78R | 456.45 | 26.30089 | 4.225 | 3.001527
t607600 | Cymbidium hybrid cultivar | R123A | 413.025 | 13.12945 | 4.675 | 0.457347
t606940 | Cymbidium hybrid cultivar | A13S | 481.65 | 28.77925 | 6.3 | 0.424264
t607085 | Cymbidium hybrid cultivar | K55R | 349.1 | 16.14745 | 3.125 | 0.543906
t607474 | Cymbidium hybrid cultivar | G262A | 397.225 | 2.348581 | 3.4 | 0.294392
t606959 | Cymbidium hybrid cultivar | A219C | 411.275 | 14.18506 | 2.25 | 0.544671
t606859 | Cymbidium hybrid cultivar | I99E | 511.05 | 22.442 | 7.325 | 0.543906
t606904 | Cymbidium hybrid cultivar | S10R | 467.15 | 52.82088 | 7.25 | 1.626346
t607195 | Cymbidium hybrid cultivar | T111R | 404.8 | 17.43885 | 4.25 | 0.493288
t607445 | Cymbidium hybrid cultivar | P305N | 341.875 | 14.41281 | 2.725 | 0.55
t607273 | Cymbidium hybrid cultivar | V341P | 395.325 | 9.541619 | 3.625 | 0.655108
t606834 | Cymbidium hybrid cultivar | V157I | 516.05 | 31.53966 | 6.625 | 1.192686
t607254 | Cymbidium hybrid cultivar | G262T | 418.575 | 14.71991 | 4.35 | 0.574456
t606828 | Cymbidium hybrid cultivar | A107M | 503.3 | 16.92986 | 5.5 | 1.191638
t606836 | Cymbidium hybrid cultivar | K104Q | 543.5 | 39.43129 | 7.1 | 1.174734
t607081 | Cymbidium hybrid cultivar | I99G | 347.025 | 12.30864 | 2.725 | 0.567891
t607482 | Cymbidium hybrid cultivar | R24K | 405.575 | 10.66908 | 4.175 | 0.434933
t607132 | Cymbidium hybrid cultivar | D208A | 328.9 | 30.48748 | 2.675 | 0.634429
t606874 | Cymbidium hybrid cultivar | D208S | 395.55 | 19.92327 | 4.2 | 0.6733
t607190 | Cymbidium hybrid cultivar | L137M | 292.8 | 8.173942 | 1.5 | 0.216025
t607028 | Cymbidium hybrid cultivar | D208N | 356.3 | 20.88349 | 3.125 | 0.377492
t607370 | Cymbidium hybrid cultivar | L264F | 297.2 | 7.708869 | 4.575 | 0.411299
t606898 | Cymbidium hybrid cultivar | I102A | 371.3 | 14.25669 | 2.75 | 0.834666
t607216 | Cymbidium hybrid cultivar | Q206L | 280.95 | 22.84637 | 2.325 | 0.359398
t606901 | Cymbidium hybrid cultivar | D208C | 351.525 | 23.67324 | 1.8 | 1.275408
t607256 | Cymbidium hybrid cultivar | G373A | 236.425 | 5.235376 | 0 | 0
t607131 | Cymbidium hybrid cultivar | N51D | 191.275 | 227.01 | 1.9 | 2.941655
t607604 | Cymbidium hybrid cultivar | I99K | 203.675 | 235.3296 | 2.225 |
t606914 | Cymbidium hybrid cultivar | A13N | 246.7 | 285.4356 | 3.825 | 4.470925
t606934 | Cymbidium hybrid cultivar | S10N | 230.525 | 266.3877 | 2.975 | 3.435477
t607312 | Cymbidium hybrid cultivar | I255M | 60.1 | 3.31763 | 0 | 0
t607377 | Cymbidium hybrid cultivar | S339W | 0 | 0 | 0 | 0
t607318 | Cymbidium hybrid cultivar | E28P F40Y N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 260.2 | 9.255269 | 3.3 | 0.702377
t607245 | Cymbidium hybrid cultivar | E28P G84C N89P T111R N151P C176D Q206L S237F A252E G262T G281E | 265.1 | 7.813237 | 0.225 | 0.45
t607000 | Cymbidium hybrid cultivar | G373A F374L | 253.05 | 22.66282 | 0 | 0
t607337 | Cymbidium hybrid cultivar | E28P N51E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111R R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 214.225 | 6.286162 | 1.6 | 0.141421
t607265 | Cymbidium hybrid cultivar | E28P F40Y N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111Q R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 225.35 | 4.919011 | 1.8 | 0.374166
t607316 | Cymbidium hybrid cultivar | W301Y P303V P305N R308P A309T | 177.7 | 14.90526 | 2.075 | 0.801561
t607435 | Cymbidium hybrid cultivar | E28P N51D K54E N80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137M I147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E | 156.525 | 104.7564 | 1.075 | 0.813941
t607381 | Cymbidium hybrid cultivar | A266Y P305N S339W | 4.825 | 9.65 | 0 | 0
t607124 | Cymbidium hybrid cultivar | I255M A266Y W301Y P303V P305N R308P A309T S339W V341P G373A F374L | 3.625 | 7.25 | 0 | 0
t607280 | Cymbidium hybrid cultivar | A266Y P305N S339W V341P G373A F374L | 0 | 0 | 0 | 0
t607290 | Cymbidium hybrid cultivar | A266Y S339W | 0 | 0 | 0 | 0

TABLE 11B Results of Secondary Screen of Protein Engineering LibraryUsing C. sativa, Cochorus, and Cymbidium Templates (not supplementedwith sodium hexanoate) Average Standard Olivetol Deviation AverageStandard Normalized to Olivetol Wild-type Amino Acid NormalizedDeviation t606797_Cor- Normalized to template mutations from OlivetolNormalized chorus OLS t606797_Cor- Strain used wild-type (per OD)Olivetol (per OD) chorus OLS t527338 0.0048 0.0235 GFP t527340 Cannabiswild-type 1.05197 0.13292 Cannabis sativa OLS t606801 Cannabis S334P1.9025 0.12934 sativa t607067 Cannabis F367L 1.49374 0.07849 sativat607367 Cannabis G366A 1.32919 0.09047 sativa t606794 Cannabis wild-type1.30342 0.19421 sativa (Hemp) (Marijuana) t607391 Cannabis P298N 0.81320.03261 sativa t606984 Cannabis I248M 0.15063 0.02208 sativa t606899Cannabis S332W 0 0 sativa t606797 Corchorus wild-type 0.31178 0.05119 10.14396 Corchorus olitorius OLS t606807 Corchorus d1-8 Y142C 1.363121.64905 5.49321 6.64551 olitorius t606930 Corchorus d1-8 M255I 0.448150.04224 1.40843 0.13275 olitorius t607179 Corchorus Y301W 0.424280.08973 1.45657 0.30806 olitorius V302I V303T N305P P308K T309A t607332Corchorus d1-8 Y301W 0.39884 0.06665 1.15654 0.19327 olitorius V302IV303T N305P P308K T309A t607236 Corchorus M255I 0.37718 0.03147 1.275350.1064 olitorius t607006 Corchorus Y266F 0.2237 0.00789 0.72435 0.02554olitorius t606993 Corchorus d1-8 N305P 0.20738 0.02797 0.67149 0.09057olitorius t607139 Corchorus d1-8 Y266F 0.19496 0.02599 0.66932 0.08921olitorius t607158 Corchorus A373G 0.18139 0.0179 0.62271 0.06145olitorius t607153 Corchorus d1-8 A373G 0.16382 0.01353 0.56242 0.04646olitorius t606852 Corchorus N305P 0.16244 0.00836 0.65462 0.03369olitorius t607112 Corchorus W339S 0 0 0 0 olitorius t607119 Corchorusd1-8 L374F 0 0 0 0 olitorius t607141 Corchorus d1-8 Y266F 0 0 0 0olitorius W339S t607149 Corchorus d1-8 W339S 0 0 0 0 olitorius t607176Corchorus L374F 0 0 0 0 olitorius t607193 Corchorus d1-8 T12Y 0 0 0 0olitorius F39Y 
Q42R L43A Q47E Q51D Q57K I77L G79E S84C E96D T100E L121KN123K A135V A137M T139G H143Q N146K K151R H152P K156R F158M S174A V182RD183G S184A N231T K232N I241V T253D C260G M287E M353R Q357E S395Nt607371 Corchorus Y266F 0 0 0 0 olitorius W3395 t527346 Cymbidiumwild-type 1.02322 0.20846 Cymbidium hybrid OLS cultivar t607221Cymbidium N92D 1.67744 0.07244 hybrid cultivar t607228 Cymbidium M135I1.64252 0.01538 hybrid cultivar t606878 Cymbidium P303A 1.56942 0.06675hybrid cultivar t606986 Cymbidium E323G 1.56671 0.16314 hybrid cultivart606999 Cymbidium N151P 1.5521 0.09794 hybrid cultivar t607224 CymbidiumY142C 1.54592 0.10755 hybrid cultivar t606976 Cymbidium Q274K 1.542740.07232 hybrid cultivar t607241 Cymbidium V388T 1.53854 0.07404 hybridcultivar t607603 Cymbidium Y142I 1.53604 0.11347 hybrid cultivar t607222Cymbidium N11K 1.52325 0.04008 hybrid cultivar t607014 Cymbidium A287M1.49053 0.07415 hybrid cultivar t606994 Cymbidium T289A 1.46316 0.12925hybrid cultivar t606982 Cymbidium V314I 1.46045 0.08295 hybrid cultivart606995 Cymbidium I147L 1.4549 0.07909 hybrid cultivar t607007 CymbidiumG281E 1.45405 0.06194 hybrid cultivar t607008 Cymbidium 1390Q 1.452780.0047 hybrid cultivar t606965 Cymbidium Y142V 1.42708 0.18261 hybridcultivar t607107 Cymbidium A79E 1.4258 0.07329 hybrid cultivar t607194Cymbidium F86Y 1.42238 0.02094 hybrid cultivar t606981 Cymbidium E96R1.42122 0.04842 hybrid cultivar t606979 Cymbidium E32R 1.416 0.12096hybrid cultivar t606975 Cymbidium K182P 1.4052 0.03345 hybrid cultivart607230 Cymbidium E116D 1.40132 0.0774 hybrid cultivar t607004 CymbidiumL291Y 1.40103 0.06311 hybrid cultivar t606996 Cymbidium T170A 1.385740.05793 hybrid cultivar t607046 Cymbidium K359R 1.38299 0.01508 hybridcultivar t607043 Cymbidium N45T 1.38195 0.03083 hybrid cultivar t607021Cymbidium H69Y 1.37598 0.10301 hybrid cultivar t607109 Cymbidium D61R1.37011 0.10414 hybrid cultivar t607036 Cymbidium K321E 1.36745 0.04988hybrid cultivar t606912 Cymbidium D14P 1.36318 0.09371 hybrid 
cultivart607602 Cymbidium R100E 1.35489 0.07769 hybrid cultivar t607361Cymbidium E323Q 1.35416 0.07809 hybrid cultivar t606882 Cymbidium L222I1.35377 0.07907 hybrid cultivar t606962 Cymbidium D229E 1.34991 0.03688hybrid cultivar t607150 Cymbidium E285K 1.3445 0.07571 hybrid cultivart607252 Cymbidium F40Y 1.34329 0.11563 hybrid cultivar t607225 CymbidiumS260G 1.34171 0.08715 hybrid cultivar t607032 Cymbidium K355S 1.341460.08572 hybrid cultivar t607248 Cymbidium E155C 1.3346 0.03904 hybridcultivar t607155 Cymbidium C176D 1.33173 0.01853 hybrid cultivar t606958Cymbidium C82N 1.32771 0.01591 hybrid cultivar t607023 Cymbidium V71H1.32703 0.0343 hybrid cultivar t607027 Cymbidium T289K 1.32433 0.08328hybrid cultivar t606892 Cymbidium Y142T 1.31787 0.21329 hybrid cultivart607035 Cymbidium Y160G 1.31694 0.07538 hybrid cultivar t607237Cymbidium T289S 1.31692 0.10049 hybrid cultivar t607189 Cymbidium N51E1.31519 0.10593 hybrid cultivar t607118 Cymbidium T289Q 1.31279 0.05154hybrid cultivar t607018 Cymbidium I390N 1.31146 0.00884 hybrid cultivart607045 Cymbidium F30C 1.31145 0.13007 hybrid cultivar t606888 CymbidiumA107L 1.31126 0.05185 hybrid cultivar t607220 Cymbidium A13T 1.30730.0685 hybrid cultivar t606830 Cymbidium M135V 1.30213 0.13463 hybridcultivar t606832 Cymbidium E155Q 1.29883 0.18596 hybrid cultivar t607601Cymbidium K121V 1.29806 0.0748 hybrid cultivar t606857 Cymbidium A195V1.29426 0.3015 hybrid cultivar t607452 Cymbidium K54E 1.29134 0.03462hybrid cultivar t607218 Cymbidium F97M 1.29082 0.07837 hybrid cultivart607186 Cymbidium S180N 1.28937 0.10055 hybrid cultivar t607123Cymbidium S237F 1.28824 0.07181 hybrid cultivar t607286 Cymbidium S34Q1.28672 0.12162 hybrid cultivar t606918 Cymbidium V50N 1.28475 0.02954hybrid cultivar t606916 Cymbidium R100A 1.28388 0.03268 hybrid cultivart606990 Cymbidium L291W 1.27925 0.06685 hybrid cultivar t606908Cymbidium I22M 1.27921 0.08525 hybrid cultivar t606963 Cymbidium I147E1.27917 0.09902 hybrid cultivar t607226 Cymbidium 
Q115D 1.27909 0.05228hybrid cultivar t606961 Cymbidium V43M 1.27574 0.04471 hybrid cultivart607260 Cymbidium E285A 1.27099 0.05173 hybrid cultivar t607160Cymbidium V341A 1.26952 0.05458 hybrid cultivar t607156 Cymbidium H269S1.26921 0.0896 hybrid cultivar t607478 Cymbidium N11R 1.26856 0.08255hybrid cultivar t606887 Cymbidium Q274G 1.26839 0.09424 hybrid cultivart606861 Cymbidium E203K 1.26653 0.2508 hybrid cultivar t607217 CymbidiumD367E 1.26643 0.04114 hybrid cultivar t606894 Cymbidium Y142F 1.265860.09867 hybrid cultivar t607288 Cymbidium T289D 1.26389 0.05999 hybridcultivar t606952 Cymbidium V71Y 1.26229 0.12989 hybrid cultivar t607197Cymbidium G84C 1.26224 0.07225 hybrid cultivar t607146 Cymbidium G262T1.26188 0.03879 hybrid cultivar t607017 Cymbidium N78E 1.26073 0.09115hybrid cultivar t607456 Cymbidium T111Q 1.25899 0.07745 hybrid cultivart606854 Cymbidium L144I 1.25807 0.22743 hybrid cultivar t606838Cymbidium R123N 1.25683 0.23657 hybrid cultivar t607213 Cymbidium T289G1.25405 0.08316 hybrid cultivar t606932 Cymbidium S18T 1.25389 0.12779hybrid cultivar t607349 Cymbidium K282D 1.25244 0.08104 hybrid cultivart607585 Cymbidium L83I 1.2509 0.04892 hybrid cultivar t606956 CymbidiumL83M 1.24841 0.05473 hybrid cultivar t607586 Cymbidium I326R 1.247840.04812 hybrid cultivar t607025 Cymbidium I99T 1.24421 0.09819 hybridcultivar t607322 Cymbidium R123K 1.24342 0.08106 hybrid cultivar t606905Cymbidium V388A 1.24241 0.09318 hybrid cultivar t606835 Cymbidium Q161F1.24193 0.10262 hybrid cultivar t607104 Cymbidium E323H 1.24084 0.05988hybrid cultivar t607135 Cymbidium E28P 1.24063 0.06484 hybrid cultivart606891 Cymbidium K356Q 1.24008 0.09459 hybrid cultivar t607031Cymbidium N80H 1.23893 0.09737 hybrid cultivar t607317 Cymbidium C82E1.23849 0.03065 hybrid cultivar t607088 Cymbidium P303V 1.23838 0.07685hybrid cultivar t607262 Cymbidium L385M 1.23811 0.14991 hybrid cultivart606896 Cymbidium Q115S 1.2378 0.10758 hybrid cultivar t607269 CymbidiumA333R 1.23132 0.05599 hybrid 
cultivar t607294 Cymbidium K359M 1.229230.04507 hybrid cultivar t607344 Cymbidium A252E 1.22825 0.1572 hybridcultivar t607159 Cymbidium L76I 1.22743 0.10259 hybrid cultivar t606890Cymbidium K112R 1.22535 0.04006 hybrid cultivar t607284 Cymbidium F70M1.22458 0.03416 hybrid cultivar t607292 Cymbidium V331I 1.22387 0.05296hybrid cultivar t607476 Cymbidium L20F 1.22172 0.06972 hybrid cultivart606946 Cymbidium N151P 1.22164 0.0856 hybrid cultivar t607320 CymbidiumT111E 1.22064 0.07826 hybrid cultivar t607083 Cymbidium G84S 1.215420.11127 hybrid cultivar t607480 Cymbidium A13V 1.21428 0.00581 hybridcultivar t606909 Cymbidium W368H 1.21403 0.1423 hybrid cultivar t607449Cymbidium E203P 1.21329 0.09644 hybrid cultivar t606851 Cymbidium I293V1.21178 0.19419 hybrid cultivar t607079 Cymbidium S34E 1.21014 0.05037hybrid cultivar t607282 Cymbidium R100T 1.20952 0.03641 hybrid cultivart606967 Cymbidium M135A 1.20347 0.12224 hybrid cultivar t606938Cymbidium D88A 1.19819 0.02307 hybrid cultivar t607433 Cymbidium L291V1.19755 0.06191 hybrid cultivar t607357 Cymbidium T243A 1.19607 0.10095hybrid cultivar t607122 Cymbidium F374L 1.19551 0.12138 hybrid cultivart607110 Cymbidium R317T 1.19291 0.05771 hybrid cultivar t607015Cymbidium G84T 1.19021 0.12242 hybrid cultivar t607087 Cymbidium E96A1.18792 0.06196 hybrid cultivar t607019 Cymbidium I99A 1.18519 0.12123hybrid cultivar t606839 Cymbidium D207S 1.17855 0.17153 hybrid cultivart606942 Cymbidium R100P 1.17753 0.09144 hybrid cultivar t607164Cymbidium V327A 1.17683 0.0485 hybrid cultivar t606906 Cymbidium E28A1.1681 0.05241 hybrid cultivar t606868 Cymbidium L248I 1.16589 0.03553hybrid cultivar t607089 Cymbidium L77I 1.16044 0.02039 hybrid cultivart606910 Cymbidium N89P 1.15844 0.18953 hybrid cultivar t606936 CymbidiumT111K 1.15338 0.07375 hybrid cultivar t606856 Cymbidium V157T 1.148040.06609 hybrid cultivar t607450 Cymbidium I99R 1.13733 0.02723 hybridcultivar t606960 Cymbidium N78R 1.13483 0.06431 hybrid cultivar t607600Cymbidium R123A 
1.13448 0.08691 hybrid cultivar t606940 Cymbidium A13S1.13438 0.04862 hybrid cultivar t607085 Cymbidium K55R 1.12814 0.06807hybrid cultivar t607474 Cymbidium G262A 1.12724 0.09746 hybrid cultivart606959 Cymbidium A219C 1.1175 0.07755 hybrid cultivar t606859 CymbidiumI99E 1.10483 0.19373 hybrid cultivar t606904 Cymbidium S10R 1.104110.13592 hybrid cultivar t607195 Cymbidium T111R 1.10148 0.06594 hybridcultivar t607445 Cymbidium P305N 1.0924 0.04527 hybrid cultivar t607273Cymbidium V341P 1.09021 0.01522 hybrid cultivar t606834 Cymbidium V157I1.07903 0.07146 hybrid cultivar t607254 Cymbidium G262T 1.07703 0.05546hybrid cultivar t606828 Cymbidium A107M 1.07407 0.12564 hybrid cultivart606836 Cymbidium K104Q 1.07292 0.1499 hybrid cultivar t607081 CymbidiumI99G 1.0581 0.0688 hybrid cultivar t607482 Cymbidium R24K 1.054090.07847 hybrid cultivar t607132 Cymbidium D208A 1.01083 0.07712 hybridcultivar t606874 Cymbidium D208S 0.95301 0.04077 hybrid cultivar t607190Cymbidium L137M 0.90492 0.02932 hybrid cultivar t607028 Cymbidium D208N0.88114 0.04406 hybrid cultivar t607370 Cymbidium L264F 0.87471 0.07388hybrid cultivar t606898 Cymbidium I102A 0.85753 0.12759 hybrid cultivart607216 Cymbidium Q206L 0.83138 0.05303 hybrid cultivar t606901Cymbidium D208C 0.82507 0.10608 hybrid cultivar t607256 Cymbidium G373A0.65595 0.02379 hybrid cultivar t607131 Cymbidium N51D 0.5808 0.70692hybrid cultivar t607604 Cymbidium I99K 0.57364 0.66547 hybrid cultivart606914 Cymbidium A13N 0.56823 0.65663 hybrid cultivar t606934 CymbidiumS10N 0.54561 0.63028 hybrid cultivar t607312 Cymbidium I255M 0.169840.01174 hybrid cultivar t607377 Cymbidium S339W 0 0 hybrid cultivart607318 Cymbidium E28P F40Y 0.78529 0.05769 hybrid N51D K54E cultivarN80H C82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137MI147L N151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281ET289K D367E t607245 Cymbidium E28P G84C 0.77803 0.02842 hybrid N89PT111R cultivar N151P C176D Q206L S237F A252E G262T G281E 
t607000Cymbidium G373A 0.71402 0.06528 hybrid F374L cultivar t607337 CymbidiumE28P N51E 0.66352 0.03712 hybrid N80H C82E cultivar L83I G84C F86Y D88AN89P N92D F97M I99R T111R R123K L137M I147L N151P C176D S180N K182PQ206L H233N S237F A252E S260G G262A G281E T289K D367E t607265 CymbidiumE28P F40Y 0.64379 0.0247 hybrid N51D K54E cultivar N80H C82E L83I G84CF86Y D88A N89P N92D F97M I99R T111Q R123K L137M I147L N151P C176D S180NK182P Q206L H233N S237F A252E S260G G262A G281E T289K D367E t607316Cymbidium W301Y 0.49342 0.01579 hybrid P303V cultivar P305N R308P A309Tt607435 Cymbidium E28P N51D 0.46307 0.30921 hybrid K54E N80H cultivarC82E L83I G84C F86Y D88A N89P N92D F97M I99R T111E R123K L137M I147LN151P C176D S180N K182P Q206L H233N S237F A252E S260G G262A G281E T289KD367E t607381 Cymbidium A266Y 0.01389 0.02777 hybrid P305N cultivarS339W t607124 Cymbidium I255M 0.01165 0.02329 hybrid A266Y cultivarW301Y P303V P305N R308P A309T S339W V341P G373A F374L t607280 CymbidiumA266Y 0 0 hybrid P305N cultivar S339W V341P G373A F374L t607290Cymbidium A266Y 0 0 hybrid S339W cultivar
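
The substitution labels in the table above follow the usual convention of original residue, 1-based position, replacement residue (for example, K359M replaces K at position 359 with M). The following is a minimal sketch, not part of the disclosed methods, of how such labels could be parsed and applied to a sequence. The function names and the five-residue toy sequence are hypothetical and chosen only for illustration; they are not taken from the Sequence Listing.

```python
import re

# Pattern for a single substitution label such as "K359M":
# one-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label like 'K359M' into (original, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence, labels):
    """Return a copy of `sequence` with every substitution applied.

    Raises ValueError if the residue found at a position does not match
    the original residue named in the label.
    """
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"{label}: expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence, for illustration only.
toy_sequence = "MKVLA"
toy_variant = apply_substitutions(toy_sequence, ["K2R", "A5G"])
print(toy_variant)
```

Checking the named original residue before replacing it catches labels that were numbered against a different reference sequence.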

TABLE 12 Screening Results in Prototrophic S. cerevisiae strain

Strain    Strain type                                Average Olivetol [μg/L]   Standard Deviation Olivetol [μg/L]
t473139   Negative Control                           0                         0
t496101   Cannabis OLS variant (positive control)    20254.13                  2236.483
t496102   Library                                    20566.05                  2055.026
t485668   Library                                    24062.45                  4250.129
t496079   Library                                    29485.08                  2786.913
t485662   Library                                    50257.28                  3891.439
t496084   Cannabis OLS T335C point mutant            53595.65                  7035.556
t485672   Library                                    53606.37                  6230.06
t496073   Library                                    56729.84                  4435.122
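
As an illustration of how the Table 12 values can be compared, the sketch below expresses each strain's average olivetol titer as a fold-change over the positive control strain t496101. The titers are copied from Table 12; the function name and the choice of a simple ratio of averages are assumptions made for this example and are not described in the disclosure.

```python
# Average olivetol titers [μg/L], copied from Table 12.
# The negative control (t473139, 0 μg/L) is omitted.
average_olivetol = {
    "t496101": 20254.13,  # Cannabis OLS variant (positive control)
    "t496102": 20566.05,
    "t485668": 24062.45,
    "t496079": 29485.08,
    "t485662": 50257.28,
    "t496084": 53595.65,  # Cannabis OLS T335C point mutant
    "t485672": 53606.37,
    "t496073": 56729.84,
}

POSITIVE_CONTROL = "t496101"


def fold_over_control(titers, control):
    """Return each strain's titer divided by the control strain's titer."""
    baseline = titers[control]
    return {
        strain: value / baseline
        for strain, value in titers.items()
        if strain != control
    }


fold_changes = fold_over_control(average_olivetol, POSITIVE_CONTROL)

# Print strains from highest to lowest fold-change.
for strain, fold in sorted(
    fold_changes.items(), key=lambda item: item[1], reverse=True
):
    print(f"{strain}: {fold:.2f}x the positive control")
```

This ratio ignores the standard deviations reported in Table 12, so it should be read as a rough ranking rather than a statistical comparison.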

TABLE 13 Sequence Information for Strains Described in Table 9

Strain    Nucleotide Sequence (SEQ ID NO)   Protein Sequence (SEQ ID NO)
t405417   250   207
t404953   251   208
t405220   252   209
t404192   253   210
t404323   254   211
t404196   255   212
t404209   256   213
t404164   257   214
t404170   258   215
t404384   259   216
t405397   260   217
t405164   261   218
t404191   262   219
t405340   263   220
t404421   264   221
t404631   265   222
t405133   266   223
t405081   267   224
t404898   268   225
t405017   269   226
t405140   270   227
t404276   271   228
t404405   272   229
t405079   273   230
t404978   274   231
t405347   275   232
t404855   276   233
t405362   277   234
t404523   278   235
t404951   279   236
t405308   280   237
t405201   281   238
t404219   282   239
t404673   283   240
t404274   284   241
t405042   285   242
t404528   286   243
t405312   287   244
t404725   288   245
t405303   289   246
t405395   290   247
t405326   291   248
t404599   292   249

TABLE 14 Sequence Information for Strains Described in Tables 10A-10B

Strain    Nucleotide Sequence (SEQ ID NO)   Protein Sequence (SEQ ID NO)
t606794   96    80
t527340   62    5
t607067   421   293
t607367   422   294
t607391   423   295
t606801   424   296
t606984   425   297
t606899   426   298
t606797   37    6
t606807   427   299
t607179   428   300
t607149   429   301
t607139   430   302
t607112   431   303
t607332   432   304
t607153   433   305
t607158   434   306
t607236   435   307
t607141   436   308
t607176   437   309
t606930   438   310
t607193   439   311
t607006   440   312
t606993   441   313
t606852   442   314
t607119   443   315
t607371   444   316
t527346   38    7
t606952   445   317
t607284   446   318
t607262   447   319
t606938   448   320
t607260   449   321
t607159   450   322
t606946   451   323
t606861   452   324
t606918   453   325
t607135   454   326
t607286   455   327
t606942   456   328
t606959   457   329
t607294   458   330
t607282   459   331
t607230   460   332
t606965   461   333
t607288   462   334
t607228   463   335
t606909   464   336
t606962   465   337
t607150   466   338
t607361   467   339
t606932   468   340
t606940   469   341
t607269   470   342
t607186   471   343
t607476   472   344
t607031   473   345
t606916   474   346
t607292   475   347
t606908   476   348
t607248   477   349
t607023   478   350
t606936   479   351
t607433   480   352
t607600   481   353
t606894   482   354
t606963   483   355
t607603   484   356
t607452   485   357
t607197   486   358
t606996   487   359
t607043   488   360
t607254   489   361
t607478   490   362
t607132   491   363
t607109   492   364
t607155   493   365
t606956   494   366
t606906   495   367
t607195   496   368
t607449   497   369
t607256   498   370
t607349   499   371
t606960   500   372
t607601   501   373
t607021   502   374
t606874   503   375
t607320   504   376
t607317   505   377
t607224   506   378
t606912   507   379
t607602   508   380
t606905   509   381
t607156   510   382
t607474   511   383
t607482   512   384
t606854   513   385
t607032   514   386
t606830   515   387
t606961   516   388
t606868   517   389
t607083   518   390
t606958   519   391
t607273   520   392
t607241   521   393
t606857   522   394
t606901   523   395
t607015   524   396
t607586   525   397
t607122   526   398
t606882   527   399
t607146   528   400
t607585   529   401
t606828   530   402
t607081   531   403
t606887   532   404
t606891   533   405
t607160   534   406
t607194   535   407
t607377   537   409
t607265   538   410
t607000   539   411
t607245   540   412
t607318   541   413
t607435   542   414
t607337   543   415
t607316   544   416
t607124   545   417
t607280   546   418
t607290   547   419
t607381   548   420

TABLE 15 Sequence Information for Strains Described in Tables 11A-11B

Strain    Nucleotide Sequence (SEQ ID NO)   Protein Sequence (SEQ ID NO)
t527340   62    5
t606801   424   296
t607067   421   293
t607367   422   294
t606794   96    80
t607391   423   295
t606984   425   297
t606899   426   298
t606797   37    6
t606807   427   299
t606930   438   310
t607179   428   300
t607332   432   304
t607236   435   307
t607006   440   312
t606993   441   313
t607139   430   302
t607158   434   306
t607153   433   305
t606852   442   314
t607112   431   303
t607119   443   315
t607141   436   308
t607149   429   301
t607176   437   309
t607193   439   311
t607371   444   316
t527346   38    7
t607221   628   549
t607228   463   335
t606878   629   550
t606986   630   551
t606999   631   552
t607224   506   378
t606976   632   553
t607241   521   393
t607603   484   356
t607222   633   554
t607014   634   555
t606994   635   556
t606982   636   557
t606995   637   558
t607007   638   559
t607008   639   560
t606965   461   333
t607107   640   561
t607194   535   407
t606981   641   562
t606979   642   563
t606975   643   564
t607230   460   332
t607004   644   565
t606996   487   359
t607046   645   566
t607043   488   360
t607021   502   374
t607109   492   364
t607036   646   567
t606912   507   379
t607602   508   380
t607361   467   339
t606882   527   399
t606962   465   337
t607150   466   338
t607252   647   568
t607225   648   569
t607032   514   386
t607248   477   349
t607155   493   365
t606958   519   391
t607023   478   350
t607027   649   570
t606892   650   571
t607035   651   572
t607237   652   573
t607189   653   574
t607118   654   575
t607018   655   576
t607045   656   577
t606888   657   578
t607220   658   579
t606830   515   387
t606832   659   580
t607601   501   373
t606857   522   394
t607452   485   357
t607218   660   581
t607186   471   343
t607123   661   582
t607286   455   327
t606918   453   325
t606916   474   346
t606990   662   583
t606908   476   348
t606963   483   355
t607226   663   584
t606961   516   388
t607260   449   321
t607160   534   406
t607156   510   382
t607478   490   362
t606887   532   404
t606861   452   324
t607217   664   585
t606894   482   354
t607288   462   334
t606952   445   317
t607197   486   358
t607146   528   400
t607017   665   586
t607456   666   587
t606854   513   385
t606838   667   588
t607213   668   589
t606932   468   340
t607349   499   371
t607585   529   401
t606956   494   366
t607586   525   397
t607025   669   590
t607322   670   591
t606905   509   381
t606835   671   592
t607104   672   593
t607135   454   326
t606891   533   405
t607031   473   345
t607317   505   377
t607088   673   594
t607262   447   319
t606896   674   595
t607269   470   342
t607294   458   330
t607344   675   596
t607159   450   322
t606890   676   597
t607284   446   318
t607292   475   347
t607476   472   344
t606946   451   323
t607320   504   376
t607083   518   390
t607480   677   598
t606909   464   336
t607449   497   369
t606851   678   599
t607079   679   600
t607282   459   331
t606967   680   601
t606938   448   320
t607433   480   352
t607357   681   602
t607122   526   398
t607110   682   603
t607015   524   396
t607087   683   604
t607019   684   605
t606839   685   606
t606942   456   328
t607164   686   607
t606906   495   367
t606868   517   389
t607089   687   608
t606910   688   609
t606936   479   351
t606856   689   610
t607450   690   611
t606960   500   372
t607600   481   353
t606940   469   341
t607085   691   612
t607474   511   383
t606959   457   329
t606859   692   613
t606904   693   614
t607195   496   368
t607445   694   615
t607273   520   392
t606834   695   616
t607254   489   361
t606828   530   402
t606836   696   617
t607081   531   403
t607482   512   384
t607132   491   363
t606874   503   375
t607190   697   618
t607028   698   619
t607370   699   620
t606898   700   621
t607216   701   622
t606901   523   395
t607256   498   370
t607131   702   623
t607604   703   624
t606914   704   625
t606934   705   626
t607312   536   408
t607377   537   409
t607318   541   413
t607245   540   412
t607000   539   411
t607337   543   415
t607265   538   410
t607316   544   416
t607435   542   414
t607381   548   420
t607124   545   417
t607280   546   418
t607290   547   419

TABLE 16 Sequence Information for Strains Described in Table 12

Strain    Nucleotide Sequence (SEQ ID NO)   Amino Acid Sequence (SEQ ID NO)
t496101   62    5
t496102   39    8
t485668   44    13
t496079   47    16
t485662   46    15
t496084   706   627
t485672   48    17
t496073   38    7

EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

All references, including patent documents, disclosed herein are incorporated by reference in their entirety.

1. A host cell that comprises a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS comprises an amino acid sequence that has at least 90% sequence identity to any one of SEQ ID NOs: 7, 15, 145, and 714, or wherein the PKS comprises a conservatively substituted version of any one of SEQ ID NOs: 7, 15, 145, and 714, and wherein the host cell further comprises one or more heterologous polynucleotides encoding one or more of: a polyketide cyclase (PKC), a prenyltransferase (PT), and/or a terminal synthase (TS).

2. The host cell of claim 1, wherein relative to the sequence of SEQ ID NO: 7, the PKS comprises an amino acid substitution at a residue corresponding to position 28, 34, 50, 70, 71, 76, 88, 100, 151, 203, 219, 285, 359, and/or 385 in SEQ ID NO: 7.

3. The host cell of claim 1, wherein the PKS comprises: a) the amino acid P at a residue corresponding to position 28 in SEQ ID NO: 7; b) the amino acid Q at a residue corresponding to position 34 in SEQ ID NO: 7; c) the amino acid N at a residue corresponding to position 50 in SEQ ID NO: 7; d) the amino acid M at a residue corresponding to position 70 in SEQ ID NO: 7; e) the amino acid Y at a residue corresponding to position 71 in SEQ ID NO: 7; f) the amino acid I at a residue corresponding to position 76 in SEQ ID NO: 7; g) the amino acid A at a residue corresponding to position 88 in SEQ ID NO: 7; h) the amino acid P or T at a residue corresponding to position 100 in SEQ ID NO: 7; i) the amino acid P at a residue corresponding to position 151 in SEQ ID NO: 7; j) the amino acid K at a residue corresponding to position 203 in SEQ ID NO: 7; k) the amino acid C at a residue corresponding to position 219 in SEQ ID NO: 7; l) the amino acid A at a residue corresponding to position 285 in SEQ ID NO: 7; m) the amino acid M at a residue corresponding to position 359 in SEQ ID NO: 7; and/or n) the amino acid M at a residue corresponding to position 385 in SEQ ID NO: 7.

4-71. (canceled)

72. The host cell of claim 1, wherein the host cell is capable of producing a cannabinoid compound or a cannabinoid precursor, wherein the cannabinoid compound is a compound of Formulas 8, 9, 10, or 11:

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof; wherein the cannabinoid precursor is a compound of Formulas 4, 5, or 6:

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof; and wherein R is straight-chain unsubstituted C₁₋₂₀ alkyl.

73. The host cell of claim 72, wherein the host cell is capable of producing 3,5,7-trioxododecanoyl-CoA, olivetol, olivetolic acid, cannabigerolic acid (8a), cannabidiolic acid (9a), tetrahydrocannabinolic acid (10a), and/or cannabichromenic acid (11a).

74. The host cell of claim 73, wherein the host cell produces more 3,5,7-trioxododecanoyl-CoA, olivetol, and/or olivetolic acid than a host cell that: (i) does not comprise the PKS that comprises the amino acid sequence that has at least 90% sequence identity to any one of SEQ ID NOs: 7, 15, 145, and 714 and (ii) comprises a heterologous polynucleotide encoding a PKS that comprises the amino acid sequence of SEQ ID NO: 5.

75. The host cell of claim 1, wherein the PKS comprises one or more of the following amino acid substitutions relative to SEQ ID NO: 7: V71Y and F70M.

76. The host cell of claim 1, wherein the PKS comprises: a) C at a residue corresponding to position 164 in SEQ ID NO: 7; b) H at a residue corresponding to position 304 in SEQ ID NO: 7; and/or c) N at a residue corresponding to position 337 in SEQ ID NO: 7.

77. The host cell of claim 1, wherein the PKS comprises the amino acid sequence of any one of SEQ ID NOs: 7, 15, 145, and 714.

78. The host cell of claim 1, wherein the heterologous polynucleotide comprises a nucleotide sequence that has at least 90% sequence identity to any one of SEQ ID NOs: 38, 175, 176, and 205, or a codon degenerate nucleotide sequence thereof.

79. The host cell of claim 1, wherein the host cell is a yeast cell.

80. The host cell of claim 79, wherein the yeast cell is a Saccharomyces cell, a Yarrowia cell, a Pichia cell or a Komagataella cell.

81. The host cell of claim 80, wherein the Saccharomyces cell is a Saccharomyces cerevisiae cell.

82. The host cell of claim 1, wherein the PKC is an olivetolic acid cyclase (OAC).

83. The host cell of claim 72, wherein R is a straight-chain unsubstituted C₁₋₁₀ alkyl.

84. The host cell of claim 72, wherein R is a straight-chain unsubstituted C₃ or C₅ alkyl.

85. A method comprising culturing the host cell of claim 1.

86. A method for producing a cannabinoid compound or a cannabinoid precursor, wherein the method comprises culturing a host cell that comprises a heterologous polynucleotide encoding a polyketide synthase (PKS), wherein the PKS comprises an amino acid sequence that has at least 90% identity to SEQ ID NO: 7, 15, 145, or 714 and wherein the cannabinoid compound is a compound of Formulas 8, 9, 10, or 11:

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof; wherein the cannabinoid precursor is a compound of Formulas 4, 5, or 6:

or a pharmaceutically acceptable salt, solvate, hydrate, polymorph, co-crystal, tautomer, stereoisomer, isotopically labeled derivative, or prodrug thereof; and wherein R is straight-chain unsubstituted C₁₋₂₀ alkyl.

87. A bioreactor for producing a cannabinoid compound or a cannabinoid precursor containing: a. malonyl-CoA; b. an optionally substituted alkanoic acid; and c. a polyketide synthase (PKS) comprising an amino acid sequence that has at least 90% sequence identity to any one of SEQ ID NOs: 7, 715, 145, and 714, or a conservatively substituted version thereof.

88. The bioreactor of claim 87, further comprising an acyl activating enzyme (AAE), a polyketide cyclase (PKC), a prenyltransferase (PT), and/or a terminal synthase (TS).